ABSTRACT

The dispersed fixed-delay interferometer (DFDI) represents a new instrument concept for high-precision radial velocity (RV) surveys for extrasolar planets. A combination of Michelson interferometer and medium-resolution spectrograph, it has the potential for performing multi-object surveys, where most previous RV techniques have been limited to observing only one target at a time. Because of the large sample of extrasolar planets needed to better understand planetary formation, evolution, and prevalence, this new technique represents a logical next step in instrumentation for RV extrasolar planet searches, and has been proven with the single-object Exoplanet Tracker (ET) at Kitt Peak National Observatory, and the multi-object W. M. Keck/MARVELS Exoplanet Tracker at Apache Point Observatory. The development of the ET instruments has necessitated fleshing out a detailed understanding of the physical principles of the DFDI technique. Here we summarize the fundamental theoretical material needed to understand the technique and provide an overview of the physics underlying the instrument's working. We also derive some useful analytical formulae that can be used to estimate the level of various sources of error generic to the technique, such as photon shot noise when using a fiducial reference spectrum, contamination by secondary spectra (e.g., crowded sources, spectroscopic binaries, or moonlight contamination), residual interferometer comb, and reference cross-talk error. Following this, we show that the use of a traditional gas absorption fiducial reference with a DFDI can incur significant systematic errors that must be taken into account at the precision levels required to detect extrasolar planets.

1. The DFDI Concept and the ET Program

1.1. The Need for a New Instrument 

Despite the remarkable achievements in extrasolar planet detection over the last decade, identification of many more planets is still needed to constrain formation and evolutionary models. This is partially because of the unexpected diversity of planet properties uncovered, and partially because of a lack of large, well-defined, unbiased target search lists - the primary concern naturally having been to find planets in the first place. To this point many surveys have been subject to completeness issues or in some cases deliberate biases toward planet detection (e.g., da Silva et al. 2006), making it difficult to perform robust statistical analyses of the known planet sample. Armitage (2007) concluded that there is still a strong need for large uniform surveys to enlarge the statistical sample available: drawing on the unbiased survey of Fischer & Valenti (2005), he was only able to find a uniform subsample of 22 of the over 170 planets then known that satisfied the requirements for a statistical comparison with models.

A few thousand stars have been searched between the various RV surveys since the first RV discoveries of giant extrasolar planets around solar-type stars (Mayor & Queloz 1995), including a large fraction of the late-type, stable stars down to visual magnitude ~ 8. Improved instrument light throughput would help facilitate the survey of fainter stars. (A review of radial velocity (RV) discoveries is given by Udry et al. 2007.) Although the rate of detections from transit surveys will likely increase, transit surveys can only detect the small fraction of planets which happen to eclipse their parent stars (~ 10% probability for hot Jupiters, from geometrical considerations - Kane et al. 2004). Furthermore, the complementary information gained from RV detections remains of great value. There is therefore a strong case for finding a technique capable of RV surveys down to faint magnitudes and at faster speeds than have been achieved over the last decade. The Exoplanet Tracker (ET) instruments are a new type of fiber-fed radial velocity (RV) instrument based on the 'dispersed fixed-delay interferometer' (DFDI), built with the goal of satisfying this requirement.



1.2. The DFDI Principle 

The radial velocity technique for detecting exoplanets consists in measuring the reflex motion of the parent star due to an orbiting planet by measuring very precisely the resulting Doppler shifts of the stellar absorption lines. Achieving this requires extremely high precision: internal precisions now typically reach down to the 3 m s$^{-1}$ level (Butler et al. 1996; Vogt et al. 2000), and even as low as 1 m s$^{-1}$ or better (Pepe et al. 2005). For comparison, a Jupiter analogue in a circular orbit around a solar-type star would cause sinusoidal radial velocity variations with an amplitude of about 12.5 m s$^{-1}$. Exoplanet radial velocity surveys have traditionally depended on recording very high resolution echelle spectra, either cross correlating the spectra with reference template spectra, or fitting functions to the line profiles themselves to measure the positions of the centroids.
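The 12.5 m s$^{-1}$ figure quoted above can be checked with the standard two-body expression for the stellar reflex semi-amplitude. The sketch below is illustrative only: the function name and the rounded physical constants are our own choices, not taken from the ET analysis.

```python
import math

# Rounded physical constants (assumed values, SI units).
G = 6.674e-11            # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30         # solar mass, kg
M_JUP = 1.898e27         # Jupiter mass, kg
YEAR = 365.25 * 86400.0  # Julian year, s


def rv_semi_amplitude(period_s, m_planet, m_star, sin_i=1.0, ecc=0.0):
    """Stellar reflex velocity semi-amplitude K in m/s (two-body Kepler orbit)."""
    return ((2.0 * math.pi * G / period_s) ** (1.0 / 3.0)
            * m_planet * sin_i
            / (m_star + m_planet) ** (2.0 / 3.0)
            / math.sqrt(1.0 - ecc ** 2))


# Jupiter analogue: 11.862 yr circular, edge-on orbit around a solar-mass star.
k_jupiter = rv_semi_amplitude(11.862 * YEAR, M_JUP, M_SUN)
print(f"K for a Jupiter analogue: {k_jupiter:.2f} m/s")
```

An inclined orbit scales the result by sin i, which is why RV surveys measure only a minimum planet mass.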

The DFDI technique, upon which the Exoplanet Tracker (ET) instruments are based, comprises a Michelson interferometer followed by a low or medium resolution post-disperser (also referred to by Erskine (2003) as an externally dispersed interferometer, or 'EDI', emphasizing the distinction from techniques where the dispersing element is internal to the interferometer). The effective resolution of the instrument is determined primarily by the interferometer, so the post-dispersing spectrograph can be of much lower resolution than in traditional dispersive techniques, and consequently can be smaller, cheaper, and have higher throughput (Ge 2002; Ge et al. 2003a,b). The technique is closely related to Fourier transform spectroscopy: the post-disperser effectively creates a continuum of very narrow bandpasses for the interferometer, increasing the interference fringe contrast. All the information needed is contained in the fringe phase and visibility. It emerges that since we are only interested in the Doppler shift of the lines, measurements are required at essentially only one value of interferometer delay (hence 'fixed delay').

The cost of the instrument is comparatively low, and most importantly, it can operate in a single-order mode: where traditional echelle spectrograph techniques operate by spreading a single stellar spectrum over an entire CCD detector in multiple orders, here the spectrum only takes up one strip along the detector. Spectra from multiple stars can therefore be lined up at once on a single detector. In combination with a wide field multi-fiber telescope, this makes multi-object RV planet surveying possible (Ge 2002; Ge et al. 2002; Mahadevan et al. 2003). The multi-object Keck ET instrument based on the DFDI technique is one of the first instruments to be built with this purpose (Ge et al. 2009).^1



The very high levels of precision required for planet detection and the difficulty of directly measuring absolute wavelengths mean that some kind of stationary reference spectrum is invariably used as a calibration. Various types of fiducial reference have been employed to overcome these problems (e.g., Griffin & Griffin 1973; Campbell & Walker 1979), but the references of choice have generally become ThAr emission lamps (Baranne et al. 1996) and iodine vapor absorption cells placed within the optical beam path (Butler et al. 1996). In this respect, the ET instruments are the same, and we discuss the use of such references with the DFDI technique in this paper.



^1 Comparable traditional dispersive multi-object instruments are the VLT GIRAFFE and UVES/FLAMES spectrographs (Loeillet et al. 2008), and the MMT Hectochelle (Szentgyorgyi & Furész 2007).



1.3. A Brief History 



The idea of using the combination of a Michelson interferometer with a postdisperser was first proposed for precision Doppler planet searches by D. J. Erskine in 1997, at Lawrence Livermore National Laboratory (Erskine & Ge 2000; Ge 2002; Ge et al. 2002; Erskine 2003). The same approach is being followed by Edelstein et al. (2007) in the infra-red, in an attempt to find planets around late-type stars. A similar approach is discussed by Mosser et al. (2003) for asteroseismology and the measurement of stellar oscillations; more recently the technique has also been adopted for the USNO Dispersed Fourier Transform Spectrograph (dFTS) instrument (Hajian et al. 2007; Behr et al. 2009) (in this last case, the interferometer delay is also varied so that high resolution spectra can be reconstructed in addition to extracting Doppler shift information - see also Erskine & Edelstein (2004)).



The idea of dispersed interferometry itself is by no means new: Michelson himself recognized the use of interferometers for spectroscopy (Michelson 1903), and even proposed combining a disperser in series with a Michelson interferometer. In this case the disperser, a prism, was placed before the interferometer, allowing only a narrow bandwidth of light to enter the interferometer in the first place. In what was likely the first realization of a DFDI, Edser & Butler (1898) placed a Fabry-Perot type interferometer in front of a spectrograph^2 to produce dispersed fringes (effectively an interferometer comb - see section 2.4), which they used as a fiducial reference for measuring the wavelengths of spectral lines. Such dispersed fringes were later to become known as 'Edser-Butler fringes' (Lawson 2000).



Somewhat later, along with the development of P. Connes' SISAM ("spectromètre interférentiel à sélection par l'amplitude de modulation," described in Jacquinot 1960), various combinations of interferometers with dispersers began to be seen in the field of astronomy. Examples include Geake et al. (1959), using a Fabry-Perot in front of a spectrograph to increase throughput; and the later SHS ("spatial heterodyne spectroscopy," Harlander et al. 1992) and HHS ("heterodyned holographic spectroscopy," Frandsen et al. 1993; Douglas 1997) techniques, using internally dispersed interferometers, where the interferometer mirrors were replaced with gratings. Barker & Hollenbach (1972) outlined an early example of the use of true fixed-delay interferometry for velocimetry, measuring the velocities of laser-illuminated projectiles in the laboratory. The use of a Michelson interferometer for actual astronomical RV measurements was proposed shortly afterward by Gorskii & Lebedev (1977) and Beckers & Brown (1978). Forrest & Ring (1978) also proposed using a Michelson interferometer with a fixed delay for high-precision Doppler measurements of single spectral lines for the detection of stellar oscillations, and more recent examples of similar spectroscopic techniques include Connes (1985) and McMillan et al. (1993, 1994). Others have also used similar techniques for Doppler imaging over very narrow bandpasses, notably the WAMDII (wide-angle Doppler imaging interferometer) and GONG (Global Oscillation Network Group) projects (Shepherd et al. 1985; Harvey & The GONG Instrument Team 1995).

^2 It was mistakenly stated in van Eyken et al. (2004a) that Edser & Butler (1898) used a Michelson rather than a Fabry-Perot interferometer, which has certain disadvantages in this application (D. J. Erskine 2005, private communication).



Many of these interferometric instruments, however, suffered from the limitation of having an extremely narrow bandpass, tending to limit their application to only bright targets. The DFDI technique used in the ET instruments allows for an arbitrarily wide bandpass, limited only by the spectrograph capabilities, while still retaining the high resolution spectral information needed for precision velocity measurements. The first such DFDI instruments were built at the Lawrence Livermore National Laboratory and the Lick 1 m telescope between 1997 and 1999, and were reported in Erskine & Ge (2000) and Ge et al. (2002). The ET project was undertaken shortly after.



1.4. The ET Project 

The ET project began at Penn State University in 2000, continuing at the University of Florida from 2004. Early lab tests were performed at Penn State, and prototype test runs were conducted at the McDonald Observatory Hobby-Eberly Telescope in late 2001, and at the Palomar 200 inch telescope in early 2002 (Ge et al. 2003b; Mahadevan 2006).



Two ET instruments have now been built: the single-object prototype ET (van Eyken et al. 2004b; Mahadevan et al. 2008a), permanently installed at the KPNO 2.1 m telescope in 2003 after a temporary test run in August 2002; and the multi-object Keck ET, first installed at the APO Sloan 2.5 m telescope in March 2005, upgraded and moved to a more stable location at the same telescope later that year, and then further upgraded and fully installed as a facility instrument housed in its own custom-built room in September 2008. The latter instrument will function as the workhorse for the SDSS III "Multi-object APO Radial Velocity Exoplanet Large-area Survey" (MARVELS; Ge et al. 2009).



Proof of concept was achieved using the KPNO ET with the first DFDI planet detection, a confirmation of the known companion to 51 Pegasi (van Eyken et al. 2004a). Our first planet discovery, HD 102195b (ET-1), was also later made using this instrument (Ge et al. 2006). The multi-object Keck ET is a full scale instrument developed to satisfy the survey requirements laid out in section 1.1, and it is anticipated that it will be able to make a significant contribution to the field of extrasolar planet searches over the next decade (Ge et al. 2009).



2. Instrument Principles and Theory 

Although various forms of the DFDI have been employed before, the concept, particularly in its specific application to exoplanet finding, is rather new. Much of the work in understanding the data from the instrument has therefore involved coming to a full understanding of the physics of the instrument itself. Related theory is discussed in a number of sources (for example Goodman 1985; Erskine & Ge 2000; Lawson 2000; Ge 2002; Ge et al. 2002; Erskine 2003; Mosser et al. 2003; van Eyken et al. 2003): an attempt is made here to draw together, expand on, and precisely state the theoretical material needed for a complete understanding of the instrument, and to provide an overview of the physics underlying the instrument's working from the perspective of precision RV planet detection. The approach taken here allows for some important insights, particularly regarding certain errors arising from the use of a common-path fiducial reference spectrum such as that from an iodine gas absorption cell. In addition we derive in section 3 some useful general formulae that can be applied to estimate analytically the magnitude of both these and a number of other types of error generic to the technique.

Taken together, this discussion should provide some of the fundamentals necessary for understanding and interpreting DFDI data. Appendix B gives a derivation showing the relation to the approach employed by Erskine (2003), to which the approach here is complementary.



2.1. Formation of a Fringing Spectrum 

Figure 1 shows a highly simplified schematic of a DFDI, consisting of the two main components, a fiber-fed Michelson interferometer and a disperser, followed by a detector. Light input from the fiber is split into two paths along the arms of the interferometer and then recombined at the beamsplitter. The output is fed to the disperser, represented for convenience as a prism, though generally this will be a spectrograph. An etalon is placed in one of the interferometer arms to create a fixed optical path difference (or 'delay'), $d = d_0$, between the two arms, while allowing for adequate field widening (Hilliard & Shepherd 1966; Mahadevan et al. 2008a). $d_0$ is typically on the order of millimeters. In practice, an iodine vapor cell can also be placed in the optical path before or after the interferometer to act as a fiducial reference (section 2.6).



Inputting a wide collimated beam of monochromatic light into the instrument with both interferometer mirrors exactly perpendicular to the light travel path will give either a bright or a dark fringe at the output of the interferometer (figure 1A), depending on whether the exact path difference $d$ between the two arms corresponds to constructive or destructive interference. If we were to scan one of the mirrors back and forth, the flux at the interferometer output would vary sinusoidally as a function of $d$. If we now tilt this mirror along the axis in the plane of the page, we effectively scan a small range of delays along the $y$ direction (i.e. perpendicular to the axis of the tilt and in the plane of the mirror, corresponding to the slit direction in the spectrograph). Hence we would see a series of parallel bright and dark fringes, now varying sinusoidally as a function of $y$ (figure 1B).^3


^3 Another way of sampling the fringes is to scan the interferometer delay in very small steps (see Erskine 2003): this allows for certain advantages in calibration as well as a one-dimensional spectrum which requires less detector real-estate, but comes at the disadvantage of requiring an actively controlled interferometer. The principles are the same, however.

Fig. 1. Dispersed interferometer schematic. $y$ corresponds to position in the slit direction (directed out of the page in the interferometer schematic), and $\lambda$ indicates wavelength in the dispersion direction. A) Output from interferometer alone with monochromatic light input, and mirror 2 untilted. B) The same with mirror 2 tilted along the axis in the plane of the page, as shown. C) Image on detector with monochromatic light at very high resolution. One fringed emission line is seen. D) Detector image with white light input. E) Image with stellar spectrum input. F) As for E but at low resolution.



Consider first a very high (actually infinite) resolution spectrograph disperser for the sake of argument: following the beam through until it reaches the detector plane would result in a single emission line with fringes along the slit direction, as shown in figure 1C. Switching the input spectrum to white light, which can be thought of as a continuum of neighboring delta functions in wavelength ($\lambda$) space, leads to a similar fringe pattern on the detector at every wavelength channel. Because the optical path difference, measured in numbers of wavelengths, is different for different wavelengths, each fringe is slightly offset in phase from its neighbors (and very slightly different in period). This gives rise to the series of parallel lines known as the interferometer 'comb,' shown in figure 1D. Going further and inputting a stellar spectrum into the instrument would simply give the product of the stellar spectrum and the comb, as in figure 1E. Finally, changing to the real case of a low or medium resolution spectrograph as for an ET-type instrument, the comb is no longer (or barely) resolved, and we see a spectrum like that in figure 1F. Such a spectrum is sometimes referred to as a spectrum "channeled with fringes," also known as Edser-Butler fringes (Edser & Butler 1898; Lawson 2000; Ge 2002). The remaining fringes contain high spatial frequency Doppler information that has been heterodyned down to lower spatial frequencies by the interferometer comb (Erskine 2003; Mahadevan 2006). It is this heterodyning that allows for the use of a low-resolution spectrograph at low dispersion, and is the key to the DFDI technique.
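To make the statement that the comb is unresolved more concrete, the following rough sketch compares the wavelength spacing between adjacent comb peaks, approximately $\lambda^2/d$ (the change in wavelength needed to add one whole wave to the path difference), with the width of one spectrograph resolution element. The delay of 7 mm is a purely hypothetical value chosen to be "on the order of millimeters"; the resolving power of 5000 is the approximate ET value quoted later in this section.

```python
def comb_period_angstrom(wavelength_A, delay_mm):
    """Approximate wavelength spacing between adjacent comb peaks, lambda^2 / d."""
    delay_A = delay_mm * 1.0e7  # 1 mm = 1e7 Angstrom
    return wavelength_A ** 2 / delay_A


def resolution_element_angstrom(wavelength_A, resolving_power):
    """Width of one spectrograph resolution element, lambda / R."""
    return wavelength_A / resolving_power


wavelength = 5000.0  # Angstrom
delay = 7.0          # mm; hypothetical, not a measured ET value
R = 5000.0           # approximate ET resolving power

comb = comb_period_angstrom(wavelength, delay)
element = resolution_element_angstrom(wavelength, R)
fringe_order = delay * 1.0e7 / wavelength  # number of waves in the delay

print(f"comb period        : {comb:.3f} Angstrom")
print(f"resolution element : {element:.3f} Angstrom")
print(f"fringe order d/lambda: {fringe_order:.0f}")
```

Under these assumptions several comb periods fall inside one resolution element, so the comb itself is washed out and only the heterodyned (beat) fringes survive.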



2.2. Fringe Phase and Visibility 



Above we outlined a simple intuitive way of understanding the formation of the DFDI fringing spectrum. For a full mathematical description, we proceed by a slightly different route. Each wavelength channel on the detector has an associated sinusoidal fringe running along the slit direction, where by 'channel,' we mean specifically an infinitesimally wide strip of the spectrum along the slit direction at pixel position $j$, where $j$ need not necessarily be an integer. A given fringe has an associated phase and visibility, where visibility is a measure of the contrast in the fringe, defined as the ratio of the amplitude of the fringe to its central (mean) flux value. Equivalently, this can be stated as $(I_{\max} - I_{\min})/(I_{\max} + I_{\min})$, where $I_{\max}$ and $I_{\min}$ are the maximum and minimum flux values in the fringe (Michelson 1903). Here we introduce the concept of a 'whirl' (Erskine & Ge 2000): the phase and visibility for a fringe can together be thought of as representing a vector, with the visibility representing the magnitude. These quantities can be determined in a number of ways; in general we simply fit a sinusoid. An ensemble of such vectors representing a full spectrum of channels is called a whirl. The whirl is the directly measured quantity from a fringing spectrum and contains the information relevant to velocity determination. Vector operations such as addition, subtraction, and scalar products can be performed on these whirls just as for the individual vectors (Erskine & Ge 2000).
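As an illustration of "simply fitting a sinusoid," the sketch below recovers the mean flux, visibility, and phase of a single channel's fringe by linear least squares. It is a minimal example under stated assumptions, not the ET reduction pipeline: the fringe frequency along the slit is taken as known, the function name `fit_fringe` is hypothetical, and the phase sign convention (a cosine of the form cos(2πfy − φ)) is our choice.

```python
import numpy as np


def fit_fringe(y, flux, freq):
    """Fit flux = mean * (1 + V * cos(2*pi*freq*y - phi)) by linear least squares.

    Returns (mean, V, phi), with phi in radians. The fringe frequency `freq`
    (cycles per unit of y) is assumed to be known in advance.
    """
    theta = 2.0 * np.pi * freq * np.asarray(y, dtype=float)
    # Linear model: flux = c0 + c1*cos(theta) + c2*sin(theta)
    design = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
    (c0, c1, c2), *_ = np.linalg.lstsq(design, np.asarray(flux, dtype=float), rcond=None)
    return c0, np.hypot(c1, c2) / c0, np.arctan2(c2, c1)


# Synthetic noise-free fringe: 40 pixels along the slit, 4 fringe periods.
y = np.arange(40.0)
true_mean, true_vis, true_phi = 1000.0, 0.3, 0.8
flux = true_mean * (1.0 + true_vis * np.cos(2.0 * np.pi * 0.1 * y - true_phi))

mean_fit, vis_fit, phi_fit = fit_fringe(y, flux, freq=0.1)

# One element of a whirl: the fringe represented as a complex number.
whirl_element = vis_fit * np.exp(1j * phi_fit)
print(mean_fit, vis_fit, phi_fit)
```

Storing each channel's result as a complex number makes the vector operations on whirls mentioned above (addition, subtraction, scalar products) ordinary array arithmetic.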



To understand what determines the values of the phase and visibility for a fringe, we can consider the contribution from each wavelength of light to a particular channel on the detector (remembering that although the channel is infinitesimally wide in its spatial extent in the dispersion direction, it still has a finite bandwidth). Each contributing wavelength has passed through the interferometer, and for an ideal interferometer, will contribute a sinusoid of 100% visibility like that in figure 1C. The flux of these sinusoids on the detector can each be described by $\Re\{1 + \exp(i 2\pi d/\lambda)\}$, where $d$ varies linearly with position $y$ along the length of the slit, and $\Re\{\ldots\}$ represents the real part of a complex expression. Since the spectrograph has finite resolution, a narrow band of such wavelengths will contribute to any given channel, owing to the overlap of line spread functions (LSFs) from neighboring wavelengths. The measured fringe along the slit direction is a continuous summation of those sinusoids, weighted by the flux of the spectrum contributing to that channel, $Q_j(\lambda)$, where $Q$ is given by the product of the power spectrum coming into the instrument and the spectrograph response function at that channel on the detector. We use the term 'spectrograph response function' throughout to refer to the light throughput as a function of wavelength at a given infinitesimal point on the detector, or equivalently, at a given channel in the image on the detector. (This is distinct from, though closely related to, the LSF - see appendix A.)

Switching from wavelength to wavenumber $\kappa = 1/\lambda$, and dropping the $j$ subscript for simplicity, the summation of sinusoids can be expressed as:

$$I(d) = \int Q(\kappa)\,\Re\{1 + e^{i 2\pi\kappa d}\}\,d\kappa = \int Q(\kappa)\,d\kappa + \Re\left\{\int Q(\kappa)\,e^{i 2\pi\kappa d}\,d\kappa\right\}, \qquad (1)$$

where $I(d)$ is the measured flux along the slit direction. The first term on the right hand side is simply the total integrated flux in the channel, which must be real valued. The second term can immediately be identified as the real part of a Fourier transform, $\Re\{\mathcal{F}[Q](d)\}$, with delay as the conjugate variable to wavenumber, and shows the close relationship between DFDI instruments and Fourier transform spectroscopy (Jacquinot 1960).



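Normalizing by dividing through by the total flux, we can define the complex quantity $\boldsymbol{\alpha}$ such that

$$I_{\rm norm}(d) = 1 + \Re\left\{\frac{\mathcal{F}[Q](d)}{\int Q(\kappa)\,d\kappa}\right\} = 1 + \Re\{\boldsymbol{\alpha}\}, \qquad (2)$$

where

$$\boldsymbol{\alpha} \equiv \alpha\,e^{i\phi_\alpha} = \frac{\mathcal{F}[Q](d)}{\int Q(\kappa)\,d\kappa}. \qquad (3)$$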

This is the fundamental equation for DFDI fringe formation: the quantity $\boldsymbol{\alpha}$ is the 'complex degree of coherence' (Goodman 1985), and describes the phase, $\phi_\alpha$, and amplitude, $\alpha$, of the normalized fringes (i.e. the visibility), as a function of $d$ and the input spectrum. $\boldsymbol{\alpha}$ is referred to here as the complex visibility.^4 More rigorous derivations of this can be found in Goodman (1985, ch. 5) and Lawson (2000), but this explanation is adequate for our purposes.
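The relation between the direct summation of sinusoids and the normalized Fourier transform can be checked numerically. The sketch below is illustrative only: the Gaussian response function, the single Gaussian absorption line, the delay of 0.7 cm, and both function names are assumptions made for the example, and the Fourier kernel is taken as $e^{-i2\pi\kappa d}$ (for real $Q$ the real part is the same for either sign of the exponent).

```python
import numpy as np


def complex_visibility(kappa, Q, d):
    """Normalized Fourier transform of Q at delay d, on a uniform kappa grid.

    The grid spacing cancels between numerator and denominator.
    """
    kernel = np.exp(-2j * np.pi * kappa * d)
    return np.sum(Q * kernel) / np.sum(Q)


def fringe_by_summation(kappa, Q, d):
    """Normalized flux from adding one 100%-visibility sinusoid per wavenumber."""
    return np.sum(Q * (1.0 + np.cos(2.0 * np.pi * kappa * d))) / np.sum(Q)


# Hypothetical channel: centre 20000 cm^-1 (5000 Angstrom), response FWHM 4 cm^-1.
kappa_bar = 20000.0
kappa = np.linspace(kappa_bar - 10.0, kappa_bar + 10.0, 4001)
response = np.exp(-0.5 * ((kappa - kappa_bar) / (4.0 / 2.355)) ** 2)
# Input spectrum: flat continuum with one narrow Gaussian absorption line.
spectrum = 1.0 - 0.6 * np.exp(-0.5 * ((kappa - kappa_bar - 0.5) / 0.15) ** 2)
Q = spectrum * response

d0 = 0.7  # cm; hypothetical fixed delay
alpha = complex_visibility(kappa, Q, d0)
fringe_direct = fringe_by_summation(kappa, Q, d0)

print("visibility:", abs(alpha), "phase (rad):", np.angle(alpha))
print("direct sum:", fringe_direct, " 1 + Re(alpha):", 1.0 + alpha.real)
```

The two printed flux values agree to rounding error, as the algebra requires. In this example the fringe visibility at the chosen delay comes almost entirely from the narrow absorption line, since the smooth continuum under the response function contributes very little coherence at so large a delay.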



^4 The quantity is generally represented by the letter $\gamma$ in the literature cited. We use $\alpha$ here instead purely for clearer distinction between bold-faced vector and regular-faced amplitude representations.

In order to understand the actual form of the fringes seen in a DFDI, it is important to realize that the portion of spectrum contributing to any given channel, $Q$, has a very narrow passband (for the ET instruments, $\Delta\lambda/\lambda \sim 1\,$Å$/5000\,$Å $= 2\times10^{-4}$). We imagine $Q$ as being a shifted version of a function $Q_0$, where $Q_0$ has characteristic width $\Delta\kappa$ and is centered at zero wavenumber. We shift $Q_0$ in wavenumber so that its center falls at wavenumber $\kappa = \bar\kappa$, and we have $Q(\kappa) = Q_0(\kappa - \bar\kappa)$. By the Fourier shift theorem we can write:

$$\mathcal{F}[Q](d) = \mathcal{F}[Q_0(\kappa - \bar\kappa)](d) = e^{-i 2\pi\bar\kappa d}\,\mathcal{F}[Q_0](d). \qquad (4)$$

The right hand side shows two components. The exponential term represents a linear phase variation with delay, varying on the scale of the period $1/\bar\kappa$. The second term, the Fourier transform, represents a modulation of this signal. By the reciprocal scaling property of Fourier transforms, the second term can be expected to vary on minimum length scales of the order of the reciprocal of the width of $Q_0$, that is, on scales of $1/\Delta\kappa$. Since $1/\Delta\kappa \gg 1/\bar\kappa$, equation 4 represents a sinusoidal fringe of frequency $\bar\kappa$ modulated by a slow variation in both phase and amplitude. To see this more clearly, we can substitute equation 4 into the first expression on the right hand side of equation 2 and write:

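$$I_{\rm norm}(d) = 1 + \frac{\Re\{e^{-i 2\pi\bar\kappa d}\,\mathcal{F}[Q_0](d)\}}{\int Q(\kappa)\,d\kappa}. \qquad (5)$$

If we define

$$\boldsymbol{\alpha}_0(d) \equiv \alpha_0(d)\,e^{i\phi_{\alpha_0}(d)} = \frac{\mathcal{F}[Q_0](d)}{\int Q(\kappa)\,d\kappa}, \qquad (6)$$

we can rewrite equation 5 as:

$$I_{\rm norm}(d) = 1 + \Re\{e^{-i 2\pi\bar\kappa d}\,\boldsymbol{\alpha}_0(d)\} = 1 + \alpha_0(d)\cos(2\pi d\bar\kappa - \phi_{\alpha_0}(d)) \qquad (7)$$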

(where we have simplified the negative in the cosine term using the symmetry of the cosine function). This clearly shows the form of the fringe. Over large ranges of $d$, the fringe appears like a 'carrier wave,' given by the cosine term, that is slowly modulated in phase and amplitude by an envelope $\boldsymbol{\alpha}_0$ (the 'coherence envelope'; Lawson 2000). Over the length of the slit direction on the detector, we sample only a very small range of delays, $d_0 - \Delta d/2 < d < d_0 + \Delta d/2$, where $d_0$ is determined by the interferometer etalon, as before, and $\Delta d$ is typically a few wavelengths. Over this range, the variation in $\boldsymbol{\alpha}_0$ is small as we show below, so we see only an approximately uniform sinusoid (see figure 2) along a single wavelength channel on the detector. In measuring the phase and visibility of this fringe, we essentially make a measurement of $\boldsymbol{\alpha}_0$ at the fixed delay $d = d_0$. The phase offset of the sinusoid is determined by the argument of $\boldsymbol{\alpha}_0$, $\phi_{\alpha_0}$. The measured (absolute) fringe visibility is simply the amplitude of the normalized fringe, $\alpha_0$.

In general, we can estimate a rough order of magnitude for the fractional change in the magnitude of the visibility between consecutive sinusoid peaks by comparing the variation length scales: to order of magnitude, we can expect $\alpha_0$ to vary by of order $\alpha_0$ on scales of $1/\Delta\kappa$, so that over one period of the sinusoid, $1/\bar\kappa$, it will vary by $\Delta\alpha_0 = \alpha_0\Delta\kappa/\bar\kappa = \alpha_0/R$, where $R$ is the spectrograph resolution. Since for any input spectrum, $1/\Delta\kappa$ determines the fastest variation scale for the envelope, this represents an upper limit. For the ET instruments, $R \sim 5000$, so that over the length of the slit (a few fringes) $\Delta\alpha_0/\alpha_0 \sim 10^{-3}$. In practice such a small variation will usually be significantly below the measurement errors in fringe phase and visibility due to photon noise for even the brightest sources, and would correspond to a final velocity error of ~ 0.1 m s$^{-1}$ for an instrument similar to the KPNO ET.^5 Even in the event that it is desired to reach such extremely high signal-to-noise ratios (S/Ns), it is in principle a simple matter to fit extra parameters to allow for non-uniformity of the sinusoidal fringe, although this has not been attempted with the ET instruments.
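The order-of-magnitude estimate above can be reproduced in a few lines. This is only a sketch of the arithmetic: the choice of five fringes for "a few fringes" is ours, and the step that equates the phase error in radians with the fractional visibility error is an assumption standing in for the relation the paper derives later.

```python
import math

R = 5000.0          # spectrograph resolving power (approximate ET value)
n_fringes = 5       # "a few fringes" along the slit; assumed value
gamma = 3300.0      # phase-velocity scaling factor, m/s per radian
n_channels = 1000   # number of independent wavelength channels

# Upper limit on the fractional change in visibility along the slit.
frac_visibility_change = n_fringes / R

# Assumption: phase error (rad) is comparable to the fractional visibility error.
phase_error = frac_visibility_change

# Averaging over independent channels reduces the error by sqrt(N).
velocity_error = gamma * phase_error / math.sqrt(n_channels)

print(f"fractional visibility change ~ {frac_visibility_change:.1e}")
print(f"velocity error ~ {velocity_error:.2f} m/s")
```

With these inputs the velocity error comes out at roughly a tenth of a meter per second, in line with the figure quoted in the text.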

In figure 2 the varying amplitude of the modulating coherence envelope, $\boldsymbol{\alpha}_0$, is illustrated explicitly, and we see how measuring the fringe over a narrow range of delays $\Delta d$ around $d_0$ gives an approximately uniform sinusoid. This corresponds directly to the image seen along the length of the slit direction in a given channel on the detector. For illustration the very simple case is shown of white light passed through a rectangular bandpass with no absorption lines, so that $Q$ (and therefore $Q_0$) is a top-hat function. $\boldsymbol{\alpha}_0(d)$, therefore, is the corresponding Fourier transform, a sinc function, with zeros at $d = n/\Delta\kappa$ ($n \in \mathbb{Z}^+$), which modulates a sinusoid of period $1/\bar\kappa$. In practice the passband, $\Delta\kappa/\bar\kappa$, will be very narrow, so that the variation of $\boldsymbol{\alpha}_0$ will be much slower compared to the sinusoid than suggested in the figure, and the sinusoid itself will be highly uniform over $\Delta d$ (i.e. over the length of the slit).
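The top-hat case is simple enough to verify directly. The sketch below compares the analytic sinc envelope with a brute-force average of phasors across the passband, and confirms the zeros at $d = n/\Delta\kappa$. The passband width of 4 cm$^{-1}$ and the function names are illustrative assumptions.

```python
import numpy as np


def tophat_envelope(d, delta_kappa):
    """Envelope amplitude for a top-hat Q_0 of full width delta_kappa.

    np.sinc(x) is sin(pi*x)/(pi*x), so this is |sinc(delta_kappa * d)|.
    """
    return np.abs(np.sinc(delta_kappa * d))


def numeric_envelope(d, delta_kappa, n=20001):
    """Same quantity by averaging phasors over a uniform grid across the top hat."""
    kappa = np.linspace(-delta_kappa / 2.0, delta_kappa / 2.0, n)
    return abs(np.mean(np.exp(-2j * np.pi * kappa * d)))


delta_kappa = 4.0    # cm^-1; hypothetical passband width
kappa_bar = 20000.0  # cm^-1; carrier wavenumber (5000 Angstrom)

analytic = tophat_envelope(0.1, delta_kappa)   # delay of 0.1 cm
numeric = numeric_envelope(0.1, delta_kappa)
first_zero = 1.0 / delta_kappa                 # delay of first envelope zero, cm
carrier_period = 1.0 / kappa_bar               # fringe period in delay, cm

print("envelope at d = 0.1 cm:", analytic, numeric)
print("envelope at first zero:", tophat_envelope(first_zero, delta_kappa))
print("envelope scale / carrier period:", first_zero / carrier_period)
```

The last ratio equals $\bar\kappa/\Delta\kappa$, i.e. the resolving power, which is why the envelope is effectively flat over the few fringe periods sampled along the slit.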

For a more complicated input spectrum, such as that from a star with its multitude of absorption lines, and for a more realistic LSF, the coherence envelope will generally also have a much more complicated shape, though the variations will still be slow in $d$ and therefore close to uniform along the slit (i.e. within the upper limit discussed above, since the width of the resolution element still determines the fastest variation scale). Each channel will have its own unique piece of spectrum contributing to it, and therefore each will have its own particular phase and visibility. It is this that gives rise to the varied patterns of fringes that are seen in the final fringing stellar spectra (e.g., figure 1F).

In practice the profile in the slit direction will also be modulated in amplitude by a slit illu- 
mination function, but this can be calibrated out or modeled during the fringe fitting, and has no 
effect on fringe visibility. Though this can present its own practical challenges for data reduction, 
the illumination function is neglected here for simplicity, and taken to be uniform and equal to 
unity. 

As an aside we note that $\boldsymbol{\alpha}_0$ and $\boldsymbol{\alpha}$ are very closely related: from their respective definitions in equations 6 and 3, $\boldsymbol{\alpha}_0(d) = e^{i 2\pi\bar\kappa d}\,\boldsymbol{\alpha}(d)$. The only difference is a phase offset, which, for a given channel $j$ at wavenumber $\bar\kappa_j$ and fixed delay $d = d_0$, is constant - that is to say, $\alpha_0 = \alpha$ and $\phi_{\alpha_0} = \phi_\alpha + 2\pi d_0\bar\kappa_j$. Since the instrument is to be used purely for differential measurements, the zero point from which phases are measured is somewhat arbitrary and has no physical significance: we are concerned with changes in phase over time, which will affect both $\boldsymbol{\alpha}_0$ and $\boldsymbol{\alpha}$ in the same way. For the analyses presented hereafter, the difference between $\boldsymbol{\alpha}_0$ and $\boldsymbol{\alpha}$ is therefore not of great significance, and either can equally well be thought of as the complex visibility. However, for the sake of consistency, $\boldsymbol{\alpha}$ is generally intended by the term.

^5 Assuming ~ 1000 independent channels, phase-velocity scaling factor $\Gamma \sim 3300$ m s$^{-1}$ rad$^{-1}$ (see section 2.3), and using the relationship between phase error and visibility error shown in section 3.1.1, equation 36, so that the expected error is $\Gamma\epsilon_\phi/\sqrt{1000} = 10^{-3}\,\Gamma/\sqrt{1000}$.

Fig. 2. Interferogram showing the coherence envelope due to a rectangular bandpass modulating the sinusoidal fringe. Along the slit direction of a fringing spectrum, a very small part of the interferogram is sampled over the range $d_0 \pm \Delta d/2$.



2.3. From Phase to Velocity

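To recap, in general, for a given channel $j$ on the detector, the complex visibility of the
measured fringe is given as in equation 3 (or see Goodman 1985, chap. 5). We can rewrite this as:

$$\alpha = \frac{\mathcal{F}[P_\kappa w_{\kappa,j}]_{d=d_0}}{\mathcal{F}[P_\kappa w_{\kappa,j}]_{d=0}} = \frac{\mathcal{F}[P_\nu w_{\nu,j}]_{\tau=d_0/c}}{\mathcal{F}[P_\nu w_{\nu,j}]_{\tau=0}}, \tag{8}$$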

where $\alpha$ is the complex visibility (or complex degree of coherence), a vector quantity whose phase
represents the phase of the measured fringe, and whose magnitude (from 0 to 1) represents the
absolute visibility of the measured fringe; $\mathcal{F}[\ldots]_{\ldots}$ represents a Fourier transform evaluated at
interferometer path difference $d$, or time delay $\tau$, where $d = c\tau$ and $c$ is the speed of light; $P$ is the
input spectrum; and $w_j$ is the response function for that particular channel on the detector, so that
the spectrum contributing to the channel is given by $Q_j = Pw_j$ as before. We take $d$ to be fixed
at a value $d_0$ (for the purposes of the calculations here, the small difference in $d$ across the length
of a sinusoidal fringe is of no consequence). Subscripts are added to explicitly indicate functions of
wavenumber, $\kappa$, or optical frequency, $\nu = c\kappa$: we note that the equation is completely equivalent in
$\kappa$ space with $d$ as the conjugate variable, or in $\nu$ space with $\tau$ as the conjugate variable. In general
the form being used will be implicit from the context, so we drop these subscripts. We have also
replaced the integral over the flux in the denominator with the Fourier transform at zero delay,
which is mathematically equivalent (this fact is made use of a number of times later on in this
analysis). All the necessary mathematics for determining Doppler shifts and for dealing with the
combination of the star and fiducial reference spectra (see section 2.6) derive from this formula.

The key to the DFDI RV technique is the fact that Doppler shifts of the spectrum result in
directly proportional phase shifts of the fringes. This is a direct consequence of the Fourier shift
theorem (see, e.g., Erskine 2003). If the spectrum shifts such that $P(\kappa) \rightarrow P'(\kappa) = P(\kappa + \Delta\kappa)$, and
we correctly follow the shift in the dispersion direction (so that we now compare to the wavelength
channel corresponding to $w_{j+\Delta j} = w_j(\kappa + \Delta\kappa)$, assuming that the spectrograph response function
maintains the same form in nearby channels, and noting that $\Delta j$ is not necessarily an integer),
then the shift theorem gives:

$$\alpha' = \frac{\mathcal{F}[P(\kappa + \Delta\kappa)\,w_j(\kappa + \Delta\kappa)]_{d=d_0}}{\mathcal{F}[P(\kappa + \Delta\kappa)\,w_j(\kappa + \Delta\kappa)]_{d=0}} = e^{i2\pi\Delta\kappa d_0}\,\frac{\mathcal{F}[P(\kappa)\,w_j(\kappa)]_{d=d_0}}{\mathcal{F}[P(\kappa)\,w_j(\kappa)]_{d=0}} = e^{i2\pi\Delta\kappa d_0}\,\alpha. \tag{9}$$

In other words, we have a phase shift of $\Delta\phi = 2\pi d_0\Delta\kappa$. By comparing the measured phase of the
new fringes, $\alpha'$, with the previously unshifted ones, $\alpha$, it is thus possible, in this simple case where
there is no superposed reference spectrum and the instrument is perfectly stable, to derive the
Doppler shift without any explicit knowledge of the underlying high resolution spectrum, or of the
spectrograph LSF. Using the Doppler shift equation $\Delta\kappa/\kappa \approx -\Delta v/c$, where $v$ represents velocity,
conventionally positive in the direction away from the observer, we can write:

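$$\Delta\phi = 2\pi d_0\,\Delta\kappa = -\frac{2\pi d_0\kappa}{c}\,\Delta v = -\frac{2\pi d_0}{c\lambda}\,\Delta v = \frac{\Delta v}{\Gamma}, \tag{10}$$

where $\Gamma$, the phase-velocity scaling factor which gives the proportionality between phase shift and
velocity shift, is defined as:

$$\Gamma \equiv -\frac{c\lambda}{2\pi d_0}. \tag{11}$$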



By combining the many measurements of the phase shift $\Delta\phi$ from each channel, $j$ (allowing, if
necessary, for the wavelength dependence of $\Gamma$), a very high precision measurement of the differential
Doppler velocity shift, $\Delta v$, can be made.
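
As a numerical illustration of equations 9 and 10, the following sketch builds a single-channel spectrum, shifts it by the wavenumber offset corresponding to a chosen velocity, and recovers that velocity from the change in fringe phase alone. All numerical values (a 7 mm delay, a 550 nm channel, Gaussian line and response shapes) and all function names are assumptions made purely for illustration; they are not instrument parameters from the text. The sign of the Fourier kernel is chosen so as to reproduce the sign of the phase factor in equation 9, and only the magnitude $c\lambda/(2\pi d_0)$ of the scaling factor is used.

```python
import numpy as np

C_LIGHT = 299_792_458.0  # speed of light, m/s


def complex_visibility(kappa, spectrum, d0):
    """Normalized Fourier transform of a channel spectrum evaluated at delay d0.

    The kernel exp(-i 2 pi kappa d) is an assumed sign convention: with it, a
    shift P(kappa) -> P(kappa + dk) multiplies the result by exp(+i 2 pi dk d0).
    """
    step = kappa[1] - kappa[0]
    numerator = np.sum(spectrum * np.exp(-2j * np.pi * kappa * d0)) * step
    denominator = np.sum(spectrum) * step  # transform at zero delay = total flux
    return numerator / denominator


def channel_spectrum(kappa, kappa_j, shift=0.0):
    """Illustrative P*w: one Gaussian absorption line under a Gaussian response."""
    k = kappa + shift  # evaluate both P and w at kappa + shift
    response = np.exp(-0.5 * ((k - kappa_j) / 150.0) ** 2)
    line = 1.0 - 0.6 * np.exp(-0.5 * ((k - kappa_j - 40.0) / 14.0) ** 2)
    return line * response


wavelength = 550e-9            # m (assumed)
kappa_j = 1.0 / wavelength     # channel wavenumber, m^-1
d0 = 7e-3                      # fixed delay, m (assumed)
dv_true = 1000.0               # injected velocity shift, m/s

kappa = kappa_j + np.arange(-3000.0, 3000.0, 0.5)
dk = -kappa_j * dv_true / C_LIGHT  # Doppler relation: dk/kappa = -dv/c

alpha = complex_visibility(kappa, channel_spectrum(kappa, kappa_j), d0)
alpha_shifted = complex_visibility(kappa, channel_spectrum(kappa, kappa_j, dk), d0)

# Measured phase shift between the shifted and unshifted fringes.
dphi = np.angle(alpha_shifted / alpha)

# Magnitude of the phase-velocity scale; the minus sign carries the sense of
# the Doppler relation (positive velocity lowers the wavenumber).
gamma_mag = C_LIGHT * wavelength / (2.0 * np.pi * d0)
dv_recovered = -gamma_mag * dphi

print(f"phase shift = {dphi:.6f} rad, recovered velocity = {dv_recovered:.3f} m/s")
```

Note that the recovery uses no knowledge of the line profile or of the response function, only the phase difference and the delay, which is the point made in the text above.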



2.4. The Interferometer Comb 

The interferometer comb, mentioned in figure 1 and the corresponding text, is really just a
special case of the discussion in section 2.2, where the input spectrum to the instrument is purely
white light continuum. In that case $Q$, the product of the input spectrum and the spectrograph
response function, is itself equal to the spectrograph response function. The comb is therefore
purely a consequence of the response function, arising naturally from equation 3. In fact, the
example used of the top-hat function for $Q$ is a reasonable first approximation for the LSF, and so
also for the response function (see appendix A), for a spectrograph where the slit width dominates
the resolution. The interferogram in figure 2 is thus a reasonable representation of the behavior of
the interferometer comb at finite resolution.

We can see from this that by appropriately choosing the delay and spectrograph slit width we
can null out the interferometer comb by finding a minimum in the envelope. Early experiments
changing the slit width and delay with ET prototypes did indeed show this kind of sinc-like variation
in the comb visibility. This becomes important when using a superimposed reference spectrum, as
in section 2.6.1.
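
The nulling condition can be made concrete for the idealized top-hat response function. The sketch below assumes a top hat of full width $\Delta\kappa$ in wavenumber, for which the normalized Fourier transform at delay $d_0$ has magnitude $|\sin(\pi d_0\Delta\kappa)/(\pi d_0\Delta\kappa)|$, so the comb vanishes whenever $d_0\Delta\kappa$ is a non-zero integer. The function names and the 7 mm delay are illustrative assumptions, and a real response function will only approximate a top hat.

```python
import numpy as np


def comb_visibility(d0, width):
    """Comb visibility for a top-hat response of full width `width` (m^-1) at delay d0 (m).

    np.sinc(x) is the normalized sinc, sin(pi x) / (pi x).
    """
    return np.abs(np.sinc(d0 * width))


def comb_visibility_numeric(d0, width, n=200_000):
    """Same quantity by direct midpoint-rule integration over the top hat."""
    kappa = ((np.arange(n) + 0.5) / n - 0.5) * width
    return np.abs(np.mean(np.exp(-2j * np.pi * kappa * d0)))


d0 = 7e-3                      # assumed delay, m
first_null_width = 1.0 / d0    # top-hat width giving the first zero of the envelope

print(f"first null at a width of {first_null_width:.2f} m^-1")
for width in (50.0, 100.0, first_null_width, 200.0):
    print(f"width = {width:7.2f} m^-1  ->  visibility = {comb_visibility(d0, width):.4f}")
```

Scanning either the width or the delay through such a null is one way to picture the visibility variation seen in the early slit-width experiments described above.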

It is also instructive to consider an idealized infinite resolution spectrograph. In this case, the
response function, $w_j$, becomes a delta function, so that $Q_j$ is also a delta function for all channels
$j$. By equation 6, given that $Q_0$ is the delta function shifted to the origin, the coherence envelope,
$\alpha_0(d)$, is the normalized Fourier transform of this delta function: $\alpha_0(d) = 1$ at all delays. Equation
3 then gives the very simple form of the resulting interferogram:

$$I_{\rm norm} = 1 + \cos(2\pi d\kappa), \tag{12}$$

where we have stopped representing $\kappa$ as a mean value since the width of the channel is negligible.
This 100% visibility 'infinite resolution' comb is the underlying form for any DFDI comb. Lowering
the resolution will reduce the visibility from 100% at the given fixed delay, as in the example
of figure 2, with perhaps an overall phase offset depending on the symmetry of the spectrograph
response function (and uniform to the extent that the response function and LSF are uniform across
all channels).

The infinite-resolution comb can also be thought of as an interferometer transmission function.
In introducing the instrument (section 2.1), we first described the formation of the DFDI
spectrum as a multiplication of the stellar spectrum and the infinite-resolution interferometer comb
(i.e., interferometer transmission function), convolved with the LSF down to the spectrograph
resolution. This is the approach adopted by Erskine (2003) and Mahadevan (2006), and both views
are entirely equivalent. Following the Fourier transform approach outlined here, however, we can
proceed somewhat further, and obtain some important insights in understanding systematic errors
from the use of a simultaneous fiducial reference spectrum (section 2.6). In principle the Fourier
transform approach can also be used to create simulated DFDI spectra without having to assume
a uniform LSF at all wavelengths, which is difficult to do in the alternative approach. A derivation
relating the two methods is outlined in appendix B.

To show visually how the comb forms, it is depicted schematically in figure 3, plotting contours
of flux from equation 12 as a function of wavelength $\lambda = 1/\kappa$ and delay $d$. Since $\lambda$ maps linearly
to $x$ position on the detector and delay maps linearly to $y$ position along the slit (at least for an
ideal spectrograph and interferometer), this also represents the image that would be seen on the
detector if the full ranges could be sampled down to zero wavelength and zero delay. The box in the
figure schematically represents the segment of the interferogram that we actually observe with the
instrument: a series of tilted parallel fringes (as shown in figure 1), with a very slow wavelength
dependency. For clarity, the figure is not to scale: in practice, the delay is fixed to a much larger
value so that the fringes are observed at much higher order, $n$, and the wavelengths observed are
much longer, so that any real observed comb is much denser and more uniform, and the variations
with wavelength much smaller.



2.5. Calculating the Interferometer Delay 

The interferometer delay, $d_0$, is determined by the etalon in the interferometer, and must be
known precisely in order to be able to accurately translate from phase measurements to velocity
measurements. The best precision that can be obtained in RV measurements is a trade-off between
maximizing the phase response per unit velocity (i.e., a small magnitude of the phase-velocity scale
$\Gamma$, so that a large phase shift results from a small change in velocity) and maximizing the visibility
of the fringes (since higher visibility means more accurate measurements of the fringe phases). Since
the visibility of the fringes is determined by the match between $d_0$ and the typical spectral line
widths to be observed, an optimal value of $d_0$ can be chosen to give the best precision for the
expected typical targets for the survey (Ge 2002). This is set at design time, and remains fixed for
the instrument.



Fig. 3. — Simulated interferometer comb, as a function of wavelength (corresponding with dispersion 
direction on detector) and delay (corresponding with slit direction). Setting a large interferometer 
delay and choosing the wavelength range over which the spectrum is observed selects a 'window' in 
the comb (shown schematically) where the fringes are approximately parallel. The orders of some 
of the fringes, n, are shown down the right-hand side. In practice, the 'window' chosen is at much 
longer wavelength and much higher order. 

Annual variations in the RV of a star can be as large as ~60 km s$^{-1}$ even for an RV-stable
target, owing simply to the orbital motion of the Earth around the Sun (which dominates
significantly over the Earth's rotation). If we are to consider approaching precisions on the order of
1 m s$^{-1}$ we therefore need to know $\Gamma$ to better than one part in 60,000. Since $\Gamma$ depends directly
on the interferometer delay (equation 11), determining $\Gamma$ is synonymous with measuring the delay.

To a first approximation, the delay can be calculated from the properties of the etalon in the
interferometer. For example, for a monolithic interferometer with arm lengths $L_1$ and $L_2$ and
refractive indices $n_1$ and $n_2$ respectively, this is given by (Mahadevan et al. 2008a):

$$d_0 = 2(n_1L_1 - n_2L_2). \tag{13}$$

This depends on the assumption that there is negligible dispersion in the etalon glass, i.e.,
that $n_1$ and $n_2$ are close to independent of wavelength over the wavelength range of interest.
Dispersion can in fact be a significant effect, but the assumption should be good to a few percent
(Barker & Schuler 1974; D. J. Erskine 2001, private communication), enough for an initial estimate.
Accounting fully for the dispersion and allowing $d_0$ to become a function of wavelength, however,
is essential where very high velocity precision is required from large bandwidth observations.

A more precise measure of the delay can be determined simply by counting fringes in the
interferometer comb. We know from equation 12 that the phase of the comb varies as $\phi = 2\pi d\kappa =
2\pi d/\lambda$. Although this equation is for a comb at infinite resolution, the same variation will hold
true at lower resolutions: a spectrograph response function broader than a delta function will only
reduce the visibility of the interferogram, and possibly add an overall phase offset to the entire
interferogram (provided that the shape of the response function is uniform across the detector).
Differentiating with respect to wavelength:

$$\frac{d\phi}{d\lambda} = \frac{d(2\pi n)}{d\lambda} = -\frac{2\pi d}{\lambda^2}, \tag{14}$$

where $n = \phi/2\pi$ is the fringe order, giving:

$$d = -\lambda^2\,\frac{dn}{d\lambda}. \tag{15}$$

In other words, by counting the fringe density $dn/d\lambda$ over wavelength, we can immediately calculate
$d_0$, and hence $\Gamma$. Since there is a $\lambda^{-2}$ dependence in $dn/d\lambda$ itself, care needs to be taken to account
for the dependence properly when determining the fringe density at a given wavelength. This
may more easily be done in wavenumber space instead, since the fringe density is uniform with
wavenumber, and $d = dn/d\kappa$.
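
The wavenumber-space version of the fringe count, $d = dn/d\kappa$, can be sketched as follows. The example synthesizes an ideal comb from equation 12 along one row of the dispersion direction, locates the upward crossings of the mean level by linear interpolation, and divides the number of whole fringes by the wavenumber interval they span. The delay, the wavenumber range, the sampling density and the function name are all assumed for illustration; real data would add noise, finite visibility and detector sampling effects that this sketch ignores.

```python
import numpy as np


def delay_from_fringe_count(kappa, flux):
    """Estimate the delay d = dn/dkappa by counting fringes along one row.

    Upward crossings of the mean flux level are located by linear
    interpolation between neighbouring samples.
    """
    y = flux - np.mean(flux)
    idx = np.flatnonzero((y[:-1] < 0.0) & (y[1:] >= 0.0))
    frac = -y[idx] / (y[idx + 1] - y[idx])
    crossings = kappa[idx] + frac * (kappa[idx + 1] - kappa[idx])
    n_fringes = crossings.size - 1  # whole fringes between first and last crossing
    return n_fringes / (crossings[-1] - crossings[0])


d0_true = 7e-3                                   # assumed delay, m
kappa = np.linspace(1.8000e6, 1.9430e6, 20001)   # wavenumber samples, m^-1
flux = 1.0 + np.cos(2.0 * np.pi * d0_true * kappa)  # ideal comb, equation 12

d0_est = delay_from_fringe_count(kappa, flux)
print(f"true delay = {d0_true:.6e} m, estimated = {d0_est:.6e} m")
```

In this noise-free case the estimate is limited only by the interpolation; with real combs the achievable accuracy is set by how many fringes can actually be counted, as discussed next.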

In practice, counting fringes is often not easy, since the comb is often barely resolved (usually
by design). As long as the comb is not under-sampled on the detector, this can be overcome by
temporarily using a narrower slit in the spectrograph, since in principle the delay should only need
to be determined once. Even so, it is usually possible in practice only to count over a range of a
few hundred to one or two thousand fringes. Counting along one row in the dispersion direction of
the comb therefore gives an accuracy on the order of one part in 1000. Over a 60 km s$^{-1}$ variation,
this is still only good to the 60 m s$^{-1}$ level. Our method of choice in the past has been simply
to observe known stable reference stars over the time baseline of interest and use their known
apparent changes in velocity due to the Earth's motion to calibrate $\Gamma$. Provided the reference
stars are genuinely stable, and they are positioned in the sky such that their barycentric motions
are large, this technique will provide an accuracy in the determination of $\Gamma$ at least equal to the
intrinsic RV stability of the stars.

Other methods are under investigation which should allow more precise measurement of the 
delay. By averaging fringe counts over many rows of a wide spectrum, and further averaging over 
many frames, it may be possible to achieve significantly sub-fringe counting accuracy (J. Wang 
et al., 2010, in preparation). Other techniques in development using a separate device to directly 
measure the interferometer delay should provide a robust direct measurement that obviates the 
need for more laborious empirical delay determination (X. Wan et al., 2010, in preparation). 



2.6. Handling a Fiducial Reference Spectrum 

2.6.1. Multiplied Reference 

The extremely high sensitivity of the instrument means that numerous instrumental effects 
can masquerade as velocity shifts. Tiny changes in the interferometer delay due to thermal flexure, 
for example, will appear as phase shifts in the fringe pattern. The image itself can also shift as a 
whole on the detector in both the slit and the dispersion directions. 

One way of accounting for these instrumental artifacts is to use a fiducial spectrum from some 
known zero-velocity reference. The simplest way to do this is to bracket the science data, either 
spatially, running the fiducial spectra along a separate optical path alongside the target spectrum; or 
temporally, alternating target exposures and reference spectrum exposures along the same optical 
path. Since the reference spectrum is stationary with respect to the instrument, it will track 
instrument shifts, which can then be subtracted from the measured stellar shift to reveal the star's 
intrinsic motion. (Note that from equation 10, a change in $d_0$ conveniently has mathematically
exactly the same effect as a change in velocity, $\Delta v$.) These approaches, however, potentially suffer
from errors due to their separation from the science data: in the first case, because of non-common 
path errors due to imperfect optics, and in the second case, because the fiducial exposures are not 
tracking instrument drift contemporaneously with the data. 

An alternative approach is to insert an absorption reference into the optical path - in the 
case of the ET instruments in the past, a glass cell filled with iodine vapor maintained at a fixed 
temperature, the traditional reference of choice for RV planet searches. In this way the reference 
spectrum is multiplied with the stellar spectrum. To do this, for each target to be observed, two 
fringing 'template' spectra are taken, one being pure star with no reference in the beam path, and
the other pure reference (for ET, a pure iodine spectrum taken by shining a tungsten continuum
lamp through the cell). These templates are then used to separate out the stellar and reference
components of the combined star/reference data (referred to here as 'data' or 'measurement' frames,
as distinct from 'template' frames). A formalism is required to extract the reference and stellar
spectra from the combined spectrum. In order to proceed, we define the following symbols:

• $j$ — as before, the pixel number in the dispersion direction which identifies the column along
which a fringe is measured in the slit direction, corresponding to a single channel. Strictly
speaking, the channel is infinitesimally wide on the detector, so that $j$ need not necessarily be
an integer. Since the spectrum is oversampled, however, it is often a reasonable simplification
to think of the entire pixel column representing an infinitesimal sample in the dispersion
direction (see appendix A).

• $\mathbf{M}(j)$ — the complex visibility vector (i.e., phase and absolute visibility) for a fringe at channel
$j$ in a single Doppler measurement frame of combined star/reference data, an ensemble of
such values for a spectrum across all $j$ comprising a 'whirl.'

• $\mathbf{S}(j)$ — the measured complex visibility for the star template at channel $j$.

• $\mathbf{I}(j)$ — the measured complex visibility for the reference template at channel $j$.

• $\mathcal{M}(\lambda) = C_m(\lambda)M(\lambda)$ — the input spectrum for a combined star/reference data frame, where
$C_m$ represents a normalization, such as the continuum function, and $M$ is the normalized
spectral density. $C_m$ is assumed constant to a good approximation over the scale of the width
of the response function $w$ (see below) and instrument LSF, and $0 \le M \le 1$.

• $\mathcal{S}(\lambda) = C_s(\lambda)S(\lambda)$ — the same for the star template spectrum.

• $\mathcal{I}(\lambda) = C_i(\lambda)I(\lambda)$ — the same for the reference template spectrum.

• $s(\lambda)$, $i(\lambda)$ — such that $S = 1 - s$, $I = 1 - i$; $0 \le (s, i) \le 1$.

• $w(j, \lambda)$ — the response function at position $j$ on the detector, i.e., the spectrum that
contributes to an infinitesimally wide channel at the detector plane if perfect continuum light is
passed through the instrument. (Note that $w$ is very closely related to the instrument LSF;
see appendix A.)

• $d$ — the interferometer delay, fixed to a value of $d = d_0$, as usual.

• $\Gamma$ — phase/velocity scaling constant, also as before.

• $\mathcal{F}[\ldots]_d$ — as before, Fourier transform evaluated at interferometer path difference $d$.

• $\ldots|_d$ — shorter notation for Fourier transform, for convenience.

• $[\ldots \otimes \ldots]|_d$ — used to denote convolution, evaluated at a delay of $d$.

We assume for now the case where there is neither intrinsic Doppler shift nor any instrument 
shift in either phase or in the dispersion direction, for both star and reference components. We also 
assume no photon shot noise. Here the aim is simply to reconstruct the data whirl from the two 
template whirls. Once this is achieved, it is conceptually a relatively trivial step to allow for shifted 
and noisy data: the template whirls need only to be shifted iteratively in phase and translated in 
the dispersion direction until a best-fit solution is found, allowing the intrinsic stellar Doppler shift 
to be directly calculated. This can be done using any standard least-squares method. 

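Following equation 8, the complex visibility measured at detector channel $j$ for the two
templates, $\mathbf{S}$ and $\mathbf{I}$, and the combined star/reference data, $\mathbf{M}$, can be written exactly as:

$$\mathbf{S} = \frac{\mathcal{F}[\mathcal{S}w]_{d_0}}{\mathcal{F}[\mathcal{S}w]_0} = \frac{[\mathcal{S} \otimes w]|_{d_0}}{[\mathcal{S} \otimes w]|_0}, \tag{16}$$

$$\mathbf{I} = \frac{\mathcal{F}[\mathcal{I}w]_{d_0}}{\mathcal{F}[\mathcal{I}w]_0} = \frac{[\mathcal{I} \otimes w]|_{d_0}}{[\mathcal{I} \otimes w]|_0}, \tag{17}$$

$$\mathbf{M} = \frac{\mathcal{F}[\mathcal{M}w]_{d_0}}{\mathcal{F}[\mathcal{M}w]_0} = \frac{[\mathcal{M} \otimes w]|_{d_0}}{[\mathcal{M} \otimes w]|_0}. \tag{18}$$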



The key lies in expressing equation 18 in terms of equations 16 and 17. This is made difficult by the
convolutions, which appear to require knowledge of the template spectra at all possible values of
the delay $d$ in order to be evaluated. The nature of the DFDI is such that we measure it only at
one value, $d_0$. An approximation can be used to address this problem, which is described in section

mm 

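It is possible to rewrite the input spectrum as:

$$\begin{aligned}
\mathcal{M} &= A\,\mathcal{S}\,\mathcal{I} \\
&= A C_s C_i\,SI = C'SI \\
&= C'(1 - s)(1 - i) \\
&= C'(1 - s + 1 - i - 1 + si) \\
&= C'(S + I - 1 + si),
\end{aligned} \tag{19}$$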

where $A$ is a scaling constant to allow for difference in total flux level between the templates and
data, and $C' = AC_sC_i$ is a constant over the width of the response function. If we assume either $s$ or
$i$ or both $\ll 1$, then the 'crosstalk' term, $si$, can be neglected. Since $i$ and $s$ essentially represent line
depths, this means that we are assuming either very shallow lines, or no significant overlap between
lines in the two different spectra. Keeping the crosstalk term in place for now for completeness,
however, we can continue, substituting equation 19 in the first expression on the right hand side of
equation 18:

$$\mathbf{M} = \frac{\mathcal{F}[\mathcal{M}w]_{d_0}}{\mathcal{F}[\mathcal{M}w]_0} = \frac{[Sw + Iw - w + siw]|_{d_0}}{[Sw + Iw - w + siw]|_0}. \tag{20}$$

The factor $C'$ has canceled because it is constant over the width of the response function, and
therefore can be taken outside the Fourier transforms. The denominator of this equation represents
a normalization, corresponding to the total flux in channel $j$ on the detector. The term $w|_{d_0}$ in the
numerator is due to the interferometer comb, since if white light is passed through the instrument,
then $S = I = 1$, and the cross talk term vanishes. We are then left with:

$$\mathbf{M}_{\rm continuum} = w|_{d_0}/w|_0, \tag{21}$$

which describes the interferometer comb. As expected, the properties of the comb are determined
purely by the response function, as discussed in section 2.4. There the comb was described first
for a delta-function response function, and then for a top hat; the equation here represents the
generalization to any shape of response function.

Rewriting the first expression on the right hand side of equations 16 and 17 in terms of $S$ and
$I$ and substituting into equation 20, we can write:

$$\mathbf{M} = K_s\mathbf{S} + K_i\mathbf{I} + \frac{-w|_{d_0} + siw|_{d_0}}{Sw|_0 + Iw|_0 - w|_0 + siw|_0}, \tag{22}$$

where the scalar quantities $K_s$ and $K_i$ are given by:

$$K_s = \frac{Sw|_0}{Sw|_0 + Iw|_0 - w|_0 + siw|_0}, \qquad K_i = \frac{Iw|_0}{Sw|_0 + Iw|_0 - w|_0 + siw|_0}. \tag{23}$$

Hence we see that we can now represent the combined star/reference data in terms of a linear
combination of the measured star and reference templates, along with an error term.
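
The decomposition in equation 22 is an exact identity, which can be checked numerically for a single channel. The sketch below assumes continuum-normalized spectra with $C' = 1$, a Gaussian response function, and Gaussian line depths for the star and reference; these shapes, the 7 mm delay and all variable names are illustrative choices rather than anything taken from the instrument. Wavenumber is measured relative to the channel centre, which only removes a phase factor common to every term.

```python
import numpy as np


def ft(kappa, f, d):
    """Fourier transform of f(kappa) evaluated at a single delay d (rectangle rule)."""
    step = kappa[1] - kappa[0]
    return np.sum(f * np.exp(-2j * np.pi * kappa * d)) * step


# Wavenumber offsets from the channel centre (m^-1) and an assumed delay (m).
kappa = np.arange(-3000.0, 3000.0, 0.5)
d0 = 7e-3

# Illustrative inputs only: Gaussian response function and Gaussian line depths.
w = np.exp(-0.5 * (kappa / 150.0) ** 2)
s = 0.6 * np.exp(-0.5 * ((kappa - 40.0) / 14.0) ** 2)        # stellar line depth
i_ref = (0.5 * np.exp(-0.5 * ((kappa - 55.0) / 10.0) ** 2)   # reference line depths
         + 0.4 * np.exp(-0.5 * ((kappa + 120.0) / 10.0) ** 2))
S, I = 1.0 - s, 1.0 - i_ref

# Template and data visibilities for this channel (equations 16-18).
S_vis = ft(kappa, S * w, d0) / ft(kappa, S * w, 0.0)
I_vis = ft(kappa, I * w, d0) / ft(kappa, I * w, 0.0)
M_vis = ft(kappa, S * I * w, d0) / ft(kappa, S * I * w, 0.0)

# Common denominator of equations 22 and 23, and the two scaling factors.
D = (ft(kappa, S * w, 0.0) + ft(kappa, I * w, 0.0)
     - ft(kappa, w, 0.0) + ft(kappa, s * i_ref * w, 0.0))
K_s = ft(kappa, S * w, 0.0) / D
K_i = ft(kappa, I * w, 0.0) / D

comb_term = -ft(kappa, w, d0) / D
crosstalk_term = ft(kappa, s * i_ref * w, d0) / D

exact = K_s * S_vis + K_i * I_vis + comb_term + crosstalk_term
addition_only = K_s * S_vis + K_i * I_vis

print(f"|M - exact decomposition| = {abs(M_vis - exact):.2e}")
print(f"|M - linear combination|  = {abs(M_vis - addition_only):.2e}")
print(f"|comb term| = {abs(comb_term):.2e}, |cross-talk term| = {abs(crosstalk_term):.2e}")
```

With overlapping star and reference lines, as assumed here, the residual of the plain linear combination is dominated by the cross-talk term; the broad Gaussian response leaves almost no comb at this delay.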

The fraction on the right in equation 22 contains two terms in the numerator, the comb term,
$w|_{d_0}$, and a cross talk term, $siw|_{d_0}$. It is in principle possible to arrange the instrument such
that at delay $d = d_0$ the interferometer comb has zero visibility, by choosing the delay and slit
width so that $\mathbf{M}_{\rm continuum}$ is at a zero point of the transform of $w$ (see section 2.4). Alternatively, it is
possible to low-pass Fourier filter the data image before measuring the whirls, essentially simulating
a lower spectrograph resolution. In either case, we assume that $w|_{d_0} \rightarrow 0$. If we now also neglect all the
cross talk terms $si$ following from equation 19, we finally have the whirl addition approximation,
which we can write as

$$\mathbf{M} \approx K_s\mathbf{S} + K_i\mathbf{I}. \tag{24}$$

$K_s$ and $K_i$ represent scaling factors in the absolute visibilities of the two templates. In the case that
we take our normalization functions ($C_m$, $C_s$, and $C_i$) to be continuum normalization functions, then
remembering that the evaluation of a Fourier transform at $d = 0$ represents the total integrated
area under the function, we can try to gain a handle on the expected sizes of these scaling factors.
To the extent that the total area under $Sw$ and $Iw$ is not much less than that under $w$ (i.e., that
the area in discrete absorption lines is small, or $\int_{\Delta w} s\,d\lambda \ll 1$ and $\int_{\Delta w} i\,d\lambda \ll 1$, where $\Delta w$ is
a representative width of the response function), equation 23 implies that $K_s, K_i \approx 1$. This can
easily be seen by rewriting in terms of $s$ and $i$ alone: we can then assume all the terms $sw|_0$, $iw|_0$,
$siw|_0 \ll 1$ (the last because both $s$ and $i$ are everywhere less than one by definition and so $si$
must always be even smaller than either), and we find we are then left with $K_s \approx w|_0/w|_0 = 1$,
and likewise for $K_i$.

As far as the addition approximation holds good, and to the extent that $K_s$ and $K_i$ are
approximately constant across all channels $j$, it is then a simple matter to allow for Doppler and
instrument drift by allowing the template whirls to rotate in phase and translate in the dispersion
direction as a function of $j$; allowing $K_s$ and $K_i$ to vary as free parameters as well, we can minimize
$\chi^2$ in the residuals to find the best fit solution compared to the measured data $\mathbf{M}$ for the complete
ensemble of wavelength channels. The difference between the phase rotation of the star and that
of the iodine (remembering to account for wavelength dependence as necessary) yields the intrinsic
differential stellar Doppler shift, while the shifts in the dispersion direction allow for Doppler shift
of the stellar lines and any instrumental image drift on the detector.
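
A stripped-down version of this fit can be sketched as a linear least-squares problem. The example below makes several simplifying assumptions that go beyond the text: the phase rotations are taken to be the same for every channel (no wavelength dependence of $\Gamma$), translation in the dispersion direction is ignored, the template whirls are random complex numbers rather than real spectra, and the quoted $\Gamma \approx 3300$ m s$^{-1}$ rad$^{-1}$ is used only as a representative magnitude. Under those assumptions each template's scaling factor and rotation combine into one complex coefficient, so equation 24 becomes linear in two unknowns.

```python
import numpy as np


def fit_whirl_rotation(M, S, I):
    """Least-squares fit of M ~ a_s * S + a_i * I with complex a_s, a_i.

    |a| plays the role of the visibility scaling factor and arg(a) the
    phase rotation of the corresponding template whirl.
    """
    design = np.column_stack([S, I])
    coeffs, *_ = np.linalg.lstsq(design, M, rcond=None)
    a_s, a_i = coeffs
    return np.abs(a_s), np.angle(a_s), np.abs(a_i), np.angle(a_i)


rng = np.random.default_rng(0)
n_channels = 500

# Hypothetical template whirls: one complex visibility per channel.
S = 0.05 * (rng.normal(size=n_channels) + 1j * rng.normal(size=n_channels))
I = 0.08 * (rng.normal(size=n_channels) + 1j * rng.normal(size=n_channels))

# Synthetic data whirl: rotated, scaled templates plus a little noise.
phi_star_true, phi_ref_true = 0.020, -0.005      # rad
noise = 1e-4 * (rng.normal(size=n_channels) + 1j * rng.normal(size=n_channels))
M = 0.90 * np.exp(1j * phi_star_true) * S + 0.95 * np.exp(1j * phi_ref_true) * I + noise

K_s, phi_star, K_i, phi_ref = fit_whirl_rotation(M, S, I)

# Star rotation minus reference rotation removes the common instrument drift.
dphi_intrinsic = phi_star - phi_ref
gamma_mag = 3300.0                               # m/s per rad, representative value
dv_mag = gamma_mag * abs(dphi_intrinsic)

print(f"K_s = {K_s:.4f}, K_i = {K_i:.4f}")
print(f"differential phase = {dphi_intrinsic:.5f} rad  ->  |dv| ~ {dv_mag:.1f} m/s")
```

A full reduction would replace the linear solve with an iterative $\chi^2$ minimization that also shifts the templates along the dispersion direction, as described above.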

By these definitions, however, there is in fact little reason to assume that $K_s$ and $K_i$ should be
constant from channel to channel. Furthermore, an iodine cell reference typically absorbs a total
of ~40% of the incident light, so that the assumption of small area within the absorption lines is
not necessarily robust across the whole spectrum. Inspecting the terms in a little more detail, we
can recast them, rewriting equation 23 as:

$$K_s = \frac{Sw|_0}{Mw|_0} = \frac{\mathcal{F}[(\mathcal{S}/C_s)w]_0}{\mathcal{F}[(\mathcal{M}/C_m)w]_0} = \frac{C_m}{C_s}\,\frac{\mathcal{S}w|_0}{\mathcal{M}w|_0}, \tag{25}$$

and likewise for $K_i$ so that we have:

$$K_i = \frac{C_m}{C_i}\,\frac{\mathcal{I}w|_0}{\mathcal{M}w|_0}. \tag{26}$$

The terms are now written in terms of measurable quantities, namely the total fluxes in each
channel $j$ for the templates and the data. We also see that they are dependent on the definition of
the functions $C_m$, $C_s$, and $C_i$. Continuum normalization functions could be determined by simply
fitting a smooth continuum function to the measured fluxes. There is, however, nothing in the
preceding analysis that requires that $C_m$, $C_s$, and $C_i$ be continuum functions. Defining them as
such allows for an intuitive approach to visualize the effect of absorption lines, but they can in fact
be any function, subject only to our requirement that the fractional deviations of the spectra from
these functions (as represented by $s$ and $i$) remain small, so that the cross-talk term also remains
small. It is arguably more appropriate to define the functions to represent the mean flux across each
of their respective wavelength channels: in this case we see that $K_s$ and $K_i$ simplify immediately
to exactly unity, independent of wavelength channel, so that they drop out of equations 22 and
24. The difference is absorbed in the cross-talk term through its dependence on $s$ and $i$, which in
turn are also dependent on $C_s$ and $C_i$ respectively. Written in this way, the whole of the addition
approximation error is included in the single cross-talk term, $siw|_{d_0}$, in equation 22.

We now have an approximate formalism for solving for stellar Doppler shifts from combined 
star/reference data, where the reference spectrum multiplies the stellar spectrum. The above 
analysis is only useful, however, in as far as the approximation that the cross talk, si, is very 
small holds well. It appears, however, that as it stands, this approximation is in fact not accurate 
enough for exoplanet searches. In section 3.3.3, we derive an estimate of the errors resulting from
the approximation, and find that systematics as large as 50 m s$^{-1}$ or more can arise. Clearly this
cannot be neglected. Approaches to correcting or avoiding the error are discussed in section 4.



2.6.2. An Alternative: Combined-beam Reference 

One possible solution to the problem of the addition approximation is to actually physically 
superpose a reference spectrum on top of the stellar target spectrum, for example by splicing two 
input fibers into one, one coming from the telescope and one from the reference lamp. In this case, 
the two spectra now combine additively instead of multiplicatively. We can then write: 



$$\mathcal{M} = A_s\mathcal{S} + A_i\mathcal{I}, \tag{27}$$

where $A_s$ and $A_i$ are scaling factors to allow for flux differences between the templates and data
(note that two such factors are now required). Once again, following equation 8, we can now write:



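$$\mathbf{M} = \frac{\mathcal{F}[\mathcal{M}w]_{d_0}}{\mathcal{F}[\mathcal{M}w]_0} = \frac{\mathcal{F}[(A_s\mathcal{S} + A_i\mathcal{I})w]_{d_0}}{\mathcal{F}[(A_s\mathcal{S} + A_i\mathcal{I})w]_0} = \frac{A_s\,\mathcal{S}w|_{d_0} + A_i\,\mathcal{I}w|_{d_0}}{A_s\,\mathcal{S}w|_0 + A_i\,\mathcal{I}w|_0}, \tag{28}$$

or alternatively,

$$\mathbf{M} = K'_s\mathbf{S} + K'_i\mathbf{I}, \tag{29}$$

where we define:

$$K'_s = \frac{A_s\,\mathcal{S}w|_0}{A_s\,\mathcal{S}w|_0 + A_i\,\mathcal{I}w|_0}, \qquad K'_i = \frac{A_i\,\mathcal{I}w|_0}{A_s\,\mathcal{S}w|_0 + A_i\,\mathcal{I}w|_0}, \tag{30}$$

or:

(31)
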
We see that we now have an exact expression for $\mathbf{M}$, with the difference being that we now need to
take into account the flux scaling factors $A_s$ and $A_i$, where previously the flux scaling factor had
canceled.

There is also a constraint on the visibility scaling constants $K'_s$ and $K'_i$. Since the Fourier
transforms at $d = 0$ represent total fluxes within the channel, flux conservation means that we can
write:

$$A_s\,\mathcal{S}w|_0 + A_i\,\mathcal{I}w|_0 = \mathcal{M}w|_0. \tag{32}$$

Dividing through by the flux in the combined star/iodine data, $\mathcal{M}w|_0$, and substituting the visibility
scaling constants, we find:

$$K'_i + K'_s = 1. \tag{33}$$

As before, we can solve for phase rotation and dispersion shift by $\chi^2$ minimization, this time
additionally solving for the two flux scaling constants. It is interesting to note that if we multiply
through both sides of equation 29 by the denominator, $\mathcal{M}w|_0$ (which represents the total flux along
the channel in the combined data), we essentially find we have an expression which is a summation
of flux $\times$ visibility terms. Since visibility is defined as $(I_{\max} - I_{\min})/(I_{\max} + I_{\min})$, where $I_{\max}$
and $I_{\min}$ are the maximum and minimum fringe intensities, then multiplying by total flux in the
channel gives a quantity equal to the amplitude of the fringe. Hence equation 29 is really simply
summing fringe amplitudes, and is exactly what we expect when the two input spectra are combined
additively: the resulting image on the detector should simply be a direct flux summation of the
respective images that would be obtained individually.
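
The exactness of the combined-beam case, and the constraint of equation 33, can be verified in the same way as for the multiplied reference. The sketch below assumes an absorption-line star spectrum, a hypothetical emission-line reference lamp, a Gaussian response function, and arbitrary flux scaling factors; none of these values come from the text. It checks that the data visibility equals the flux-weighted sum of the template visibilities, with weights summing to one.

```python
import numpy as np


def ft(kappa, f, d):
    """Fourier transform of f(kappa) evaluated at a single delay d (rectangle rule)."""
    step = kappa[1] - kappa[0]
    return np.sum(f * np.exp(-2j * np.pi * kappa * d)) * step


def gauss(kappa, centre, sigma):
    return np.exp(-0.5 * ((kappa - centre) / sigma) ** 2)


kappa = np.arange(-3000.0, 3000.0, 0.5)   # offsets from channel centre, m^-1
d0 = 7e-3                                 # assumed delay, m
w = gauss(kappa, 0.0, 150.0)              # illustrative response function

star = 1.0 - 0.6 * gauss(kappa, 40.0, 14.0)                       # absorption-line template
lamp = 0.8 * gauss(kappa, -60.0, 8.0) + 0.5 * gauss(kappa, 90.0, 8.0)  # emission-line reference
A_s, A_i = 0.7, 1.3                       # assumed flux scaling factors

# Template visibilities and the visibility of the additively combined data.
S_vis = ft(kappa, star * w, d0) / ft(kappa, star * w, 0.0)
I_vis = ft(kappa, lamp * w, d0) / ft(kappa, lamp * w, 0.0)
combined = (A_s * star + A_i * lamp) * w
M_vis = ft(kappa, combined, d0) / ft(kappa, combined, 0.0)

# Visibility scaling constants: each template's share of the total channel flux.
flux_s = A_s * ft(kappa, star * w, 0.0).real
flux_i = A_i * ft(kappa, lamp * w, 0.0).real
Kp_s = flux_s / (flux_s + flux_i)
Kp_i = flux_i / (flux_s + flux_i)

model = Kp_s * S_vis + Kp_i * I_vis
fringe_amplitude_sum = flux_s * S_vis + flux_i * I_vis

print(f"K'_s + K'_i = {Kp_s + Kp_i:.12f}")
print(f"|M - (K'_s S + K'_i I)| = {abs(M_vis - model):.2e}")
```

Unlike the multiplied case, no cross-talk or comb residual appears: the only extra unknowns are the two flux scaling factors.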



3. Sources of Error 

Here we provide derivations of some useful formulae for estimating the errors from certain 
sources for which we have been able to find analytical approaches. These include photon errors; 
additive spectral contamination errors, such as moonlight background, crowded targets, etc.; and 
multiplicative fringe-visibility contamination errors, which include in particular the cross-talk error 
due to the whirl addition approximation for in-beam absorption reference sources, but which can 
also be applied to other effects such as residual interferometer comb (again, in the case of an 
in-beam reference). The latter formulae are potentially applicable to a number of different error 
sources, and all are likely to be useful for any implementation of a DFDI instrument. 

Since this is primarily a theory paper, we do not attempt to provide a comprehensive ac- 
counting of error sources: many are instrument implementation specific, or data reduction pipeline 
specific, and better suited to empirical or semi-empirical assessment through simulations and ex- 
perimentation. Such work is still ongoing with the ET project. For more complete discussion of 
specific errors in the ET project, we point the reader to upcoming MARVELS publications on the 
instrument (J. Ge et al. 2010, in preparation) and pipeline (B. Lee et al. 2010, in preparation);
more detailed discussions of errors from earlier ET work can also be found in van Eyken (2007)
and Mahadevan (2006). Table 1 provides a summary of the examples of applications of the error
formulae provided in the text.



3.1. Photon Errors 



The errors due to photon shot noise provide an important baseline for any instrument. They 
indicate the absolute limit to the precision that can be achieved, and drive throughput and (for 
DFDI instruments) fringe visibility considerations for the optical design. It is entirely reasonable 
to conceive of a photon-limited DFDI-type instrument. However, even in cases where photon 
noise is dominated by other effects in the very high precision regime, photon noise inevitably 
becomes significant at the faint end of the stellar target sample. For the MARVELS/Keck ET,
geared toward moderate precision surveys of fainter targets, although other errors dominate the
instrument requirements error budget at the brightest ($V \approx 8$ mag) end of the target range, photon
noise becomes a significant part of the error at fainter levels (down to $V = 12$ mag, $\sim 21.5$ m s$^{-1}$ of
a total 35.0 m s$^{-1}$; see Ge et al. 2009). In the high precision, high-flux regime (e.g., a planned
1 m s$^{-1}$-level cross-dispersed DFDI upgrade for the KPNO ET), the photon error is also important
as it indicates the level below which other sources of systematic and random error must be driven.

The photon error in the phase measurement (and hence velocity measurement) from a single
channel can be estimated following Ge (2002). This gives essentially

$$\epsilon_{v,j} = \frac{c\lambda}{\pi\sqrt{2}\,d}\,\frac{1}{\alpha_j\sqrt{F_j}} = \Gamma\,\frac{\sqrt{2}}{\alpha_j\sqrt{F_j}}, \qquad (34)$$

where $\epsilon_{v,j}$ is the error in velocity due to channel $j$ alone, $c$ is the speed of light, $\lambda$ is the wavelength,
$d$ is the optical delay, $\alpha_j$ is the visibility of the fringe, $F_j$ is the total flux in the channel, and $\Gamma$ is the
usual phase-velocity scaling factor (equation [11], ignoring the negative sign since we are interested
only in the magnitude). The terms following the $\Gamma$ represent the error in phase due to the photon
noise, $\epsilon_{\phi,j} = \sqrt{2}/(\alpha_j\sqrt{F_j})$. Following a similar derivation, it is straightforward to show that the
error in visibility due to photon noise, $\epsilon_{\alpha,j}$, is given by



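$$\epsilon_{\alpha,j} = \sqrt{\frac{2}{F_j}}, \qquad (35)$$

and hence, assuming independent errors, there is a useful simple relationship between the errors in
phase and visibility:

$$\epsilon_{\phi,j} = \frac{\epsilon_{\alpha,j}}{\alpha_j}. \qquad (36)$$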



Footnote: The small difference in the numerical factor in the denominator of equation [34] ($\pi\sqrt{2}$ versus 4) is due to using the
rms slope of the fringe, rather than the mean absolute slope used in Ge (2002). Monte Carlo simulations of sinusoid
fits suggest that the rms slope gives more accurate results.

Table 1. Summary of Example Error Magnitudes

Noise Source                             Subsection   Approx. Magnitude (m s$^{-1}$)
Photon shot noise - multiplied ref.^a    3.1.1        3.2
Photon shot noise - added ref.^a         3.1.2        3.6
Photon shot noise - separate ref.^a      3.1.3        2.9
Moonlight contamination                  3.2.2        <41
Residual interferometer comb^b           3.3.2        9
Addition approximation^b                 3.3.3

Note: Error magnitudes as calculated in the text are listed: these are
examples for illustration only, and each is highly variable and dependent on
specific circumstances. See the text for assumptions made in each case.

^a Assuming iodine reference - see text for improvements using ThAr in the
added-reference case.

^b Applies only for multiplied reference spectrum.

As a general rule, we can see from equation [34] that precision goes with the inverse root of flux,
as one would expect, and also as the inverse of visibility: higher flux and/or higher visibility mean
better precision. From this formula we can derive the photon errors in the final differential RV for
different calibration scenarios.

For simplicity in the following formulae, we take $\lambda$ to be constant, taking the wavelength value
at the center of the spectrum, since it varies by only $\sim 10\%$ from one end of the spectrum to the
other in the current ET instruments. For an instrument with a very large bandwidth, however, it
may be necessary to consider it properly as a function of channel, $\lambda_j$. This simply means it cannot
be taken outside the brackets as in the following derivations, but otherwise the formalism is the
same.
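As a rough numerical illustration of equation [34], the following minimal Python sketch evaluates the phase-velocity scale factor and the single-channel photon error. The function names are hypothetical, the relation $\Gamma = c\lambda/(2\pi d)$ is the form implied by equation [34], and the 540 nm central wavelength, visibility, and flux values are assumptions chosen only for illustration (flux is taken to be in detected photons).

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def phase_velocity_scale(wavelength_m: float, delay_m: float) -> float:
    """Magnitude of Gamma = c * lambda / (2 * pi * d), in m/s per radian."""
    return C_LIGHT * wavelength_m / (2.0 * math.pi * delay_m)


def channel_phase_error(visibility: float, total_flux: float) -> float:
    """Photon-limited phase error sqrt(2) / (alpha * sqrt(F)), in radians."""
    return math.sqrt(2.0) / (visibility * math.sqrt(total_flux))


def channel_velocity_error(gamma_scale: float, visibility: float, total_flux: float) -> float:
    """Single-channel velocity error: Gamma times the phase error."""
    return gamma_scale * channel_phase_error(visibility, total_flux)


if __name__ == "__main__":
    # Assumed values: 540 nm central wavelength, 7 mm delay.
    gamma_scale = phase_velocity_scale(540e-9, 7e-3)
    print(f"Gamma = {gamma_scale:.0f} m/s per rad")
    # Assumed channel: 5% visibility, 1e4 detected photons.
    print(f"error = {channel_velocity_error(gamma_scale, 0.05, 1.0e4):.0f} m/s")
```

With these assumed inputs $\Gamma$ comes out close to the value of roughly 3700 m s$^{-1}$ rad$^{-1}$ quoted later for a 7 mm delay, and quadrupling the flux halves the error, as the inverse-root dependence requires.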



3.1.1. Photon Error for Multiplied Reference 

To calculate the expected error in an RV measurement for a single data frame, assuming an
instrument configuration where an iodine or other reference spectrum multiplies the input stellar
spectrum, we consider the resulting data spectrum as consisting of two components, a star component
and an iodine component. The calculated phase shift due to intrinsic target Doppler shift,
$\Delta\phi$, is given by:

$$\Delta\phi = \langle \phi_{\mathrm{sm},j} - \phi_{\mathrm{st},j} \rangle - \langle \phi_{\mathrm{im},j} - \phi_{\mathrm{it},j} \rangle, \qquad (37)$$

where $\langle\ldots\rangle$ here represents a weighted mean over all $j$, $\phi_{\mathrm{sm},j}$ and $\phi_{\mathrm{im},j}$ represent the phases for the
star and iodine components of the combined star/iodine data ('measurement') frame, and $\phi_{\mathrm{st},j}$ and
$\phi_{\mathrm{it},j}$ are the phases measured in the separate pure star and iodine templates. For convenience, we
immediately map these phases to corresponding 'velocity' measurements by multiplying both sides
by $\Gamma$ to give a velocity shift, $\Delta v$ (though with the caveat that a velocity measurement of a single
channel in a single spectrum has no physical meaning in itself until it is differenced with another
spectrum):

$$\Delta v = \langle v_{\mathrm{sm},j} - v_{\mathrm{st},j} \rangle - \langle v_{\mathrm{im},j} - v_{\mathrm{it},j} \rangle. \qquad (38)$$

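Using $\epsilon_v$ with corresponding subscripts to represent the various errors in this equation, we can
expect a total photon error in $\Delta v$ to be given by:

$$\epsilon_v^2 = \left[E_j\!\left(\sqrt{\epsilon_{v,\mathrm{sm},j}^2 + \epsilon_{v,\mathrm{st},j}^2}\right)\right]^2 + \left[E_j\!\left(\sqrt{\epsilon_{v,\mathrm{im},j}^2 + \epsilon_{v,\mathrm{it},j}^2}\right)\right]^2, \qquad (39)$$

where $E_j(\sigma_j)$ represents the standard statistical error in a weighted mean:

$$E_j(\sigma_j) \equiv \frac{1}{\sqrt{\sum_j 1/\sigma_j^2}}. \qquad (40)$$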
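The error combination function of equation [40] can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and inverse-variance weighting is assumed for the weighted mean, since that is the weighting for which equation [40] is the standard error.

```python
import math
from typing import Sequence


def combine_errors(sigmas: Sequence[float]) -> float:
    """E_j(sigma_j) = 1 / sqrt(sum_j 1 / sigma_j^2)."""
    return 1.0 / math.sqrt(sum(1.0 / s**2 for s in sigmas))


def weighted_mean(values: Sequence[float], sigmas: Sequence[float]) -> float:
    """Inverse-variance weighted mean of per-channel values (assumed weighting)."""
    weights = [1.0 / s**2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


if __name__ == "__main__":
    # Four channels with equal 2 m/s errors combine to 2 / sqrt(4) = 1 m/s.
    print(combine_errors([2.0, 2.0, 2.0, 2.0]))
    # A noisy channel contributes little to the weighted mean.
    print(weighted_mean([0.0, 10.0], [1.0, 3.0]))
```

For $n$ channels of equal error $\sigma$ this reduces to the familiar $\sigma/\sqrt{n}$.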

In practice, the two template terms in equation [39] are neglected, for two reasons. The first
is simply because in general the templates will have significantly higher flux than the data frame:
the iodine template can be taken with arbitrarily high flux since it is obtained with a quartz lamp
as a source; and the stellar template is usually deliberately taken with higher flux than the data so
that it does not compromise the entire data set. The second reason is a little more subtle. All RV
measurements with this kind of instrument are differential, measured relative to the two templates
which effectively set the zero point of the measurements for the star and iodine, as seen in equation
[38]. Since this 'zero point' is the same for every RV measurement, any error in the zero point will
not contribute to the rms scatter in a set of measurements which uses the same templates.

This last statement holds true to a point: accuracy in the templates is still needed in order 
to disentangle the stellar and iodine components of the combined data. From simulations of ET 
fringing spectra, we find, for example, that for a multiplied iodine reference, using a G0V or G2V
stellar template in place of a G8V template yields an rms error of 11 m s$^{-1}$ over large (60 km s$^{-1}$)
differential velocity shifts. (Depending on the precision required, this points toward the interesting 
possibility of using templates of different stars from the target star: this could allow, for example, 
for higher S/N templates when observing very faint targets, or perhaps for disentangling the signals 
from double-lined spectroscopic binaries.) 

Since photon errors go as $1/\sqrt{\mathrm{flux}}$, the remaining terms, $E_j(\epsilon_{v,\mathrm{sm}})$ and $E_j(\epsilon_{v,\mathrm{im}})$, can be
estimated by scaling the respective template errors (which, unlike the measurement component
errors, can be determined directly from equation [34]) by the flux difference between the templates
and data, giving:

$$\epsilon_v^2 = [E_j(\epsilon_{v,\mathrm{sm},j})]^2 + [E_j(\epsilon_{v,\mathrm{im},j})]^2 \qquad (41)$$

$$\phantom{\epsilon_v^2} = \frac{F_{\mathrm{st}}}{F_{\mathrm{m}}}[E_j(\epsilon_{v,\mathrm{st},j})]^2 + \frac{F_{\mathrm{it}}}{F_{\mathrm{m}}}[E_j(\epsilon_{v,\mathrm{it},j})]^2, \qquad (42)$$

where $F_{\mathrm{st}}$, $F_{\mathrm{it}}$, and $F_{\mathrm{m}}$ represent the mean fluxes across the whole star template, iodine template
and data frame respectively. Explicitly substituting equation [34] into equation [42] we find:




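$$\epsilon_v = \Gamma\sqrt{2}\,\sqrt{\frac{F_{\mathrm{st}}}{F_{\mathrm{m}}}\left[E_j\!\left(\frac{1}{\alpha_{\mathrm{st},j}\sqrt{F_{\mathrm{st},j}}}\right)\right]^2 + \frac{F_{\mathrm{it}}}{F_{\mathrm{m}}}\left[E_j\!\left(\frac{1}{\alpha_{\mathrm{it},j}\sqrt{F_{\mathrm{it},j}}}\right)\right]^2}, \qquad (43)$$

where the error combination function, $E_j$, is given by equation [40].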

Hence we have a quadrature summation of the photon errors due to the star and reference 
components of the combined star/iodine data, each being the weighted expected error in velocity 
across the respective template spectra scaled to the flux level of the data. As one would expect, 
the error goes with the inverse root of the mean flux in the data spectrum, $F_{\mathrm{m}}$; the error in
each of the two components will also scale as the inverse of the visibility in the respective fringing
spectra. Written in this form, the $E_j(\ldots)$ terms need only be calculated once, representing photon
errors for each template: they then can be conveniently scaled and combined to give the error in 
each data frame for the source. 
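The scaling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ET pipeline: the function names are hypothetical, per-channel visibilities and total fluxes of the two templates are assumed to be available as plain sequences, and the template mean flux is taken as the mean of the per-channel fluxes.

```python
import math
from typing import Sequence, Tuple


def combine_errors(sigmas: Sequence[float]) -> float:
    """Error combination function E_j of equation [40]."""
    return 1.0 / math.sqrt(sum(1.0 / s**2 for s in sigmas))


def multiplied_reference_error(
    gamma_scale: float,
    vis_star: Sequence[float],
    flux_star: Sequence[float],
    vis_ref: Sequence[float],
    flux_ref: Sequence[float],
    mean_flux_data: float,
) -> Tuple[float, float, float]:
    """Return (star, reference, total) photon errors, following equation [43]."""

    def template_term(vis: Sequence[float], flux: Sequence[float]) -> float:
        # Template mean flux times [E_j(1 / (alpha_j sqrt(F_j)))]^2.
        combined = combine_errors([1.0 / (a * math.sqrt(f)) for a, f in zip(vis, flux)])
        return (sum(flux) / len(flux)) * combined**2

    # Common prefactor Gamma * sqrt(2 / F_m): the only part that depends on the data frame.
    prefactor = gamma_scale * math.sqrt(2.0 / mean_flux_data)
    star = prefactor * math.sqrt(template_term(vis_star, flux_star))
    ref = prefactor * math.sqrt(template_term(vis_ref, flux_ref))
    return star, ref, math.hypot(star, ref)
```

Because the template terms do not depend on the data frame, they could be cached once per template and rescaled by the prefactor for each frame.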

We note that these formulae for the photon limit are for the values expected given the fringe
visibility that was obtained. Various instrument effects - for example defocus, or a non-optimal
delay for the stellar line width - can reduce the visibility from its optimum and hence degrade this
photon-limited precision.

This has been the formalism employed in calculating the photon error for the Kitt Peak
single-object ET for operations with an in-beam iodine cell. As an example, an observation taken at very
high flux with the KPNO 2.1 m ET run in May 2007 of 36 UMa (stable, $V = 4.84$ mag, 10 min
exposure) gives mean signal/noise ratios (S/N) per pixel for star template, iodine template, and
data frame of 222, 146 and 179 respectively. These values give photon errors for the star and
iodine components of 2.8 and 1.5 m s$^{-1}$ respectively, which when added in quadrature give a total
photon error of 3.2 m s$^{-1}$. The KPNO instrument design is such that both output beams from the
Michelson interferometer are recovered, and this result is for only one of the two beams. Averaging
over the two beams therefore in fact gives a further improvement of $1/\sqrt{2}$ in photon precision;
for simplicity, and for comparison with the following sections, however, we consider only one beam
here. It is interesting to note that the error due to the iodine reference is in fact comparable
in magnitude to that due to the star, since the signal in the iodine component of the data frame is
intrinsically limited by the magnitude of the target being observed. Figure [4] shows a comparison
of the actual rms (on the very short term) with the calculated photon errors using this formalism,
obtained with the KPNO ET on the bright stable star 36 UMa over a total of $\sim 2$ hr, showing
good agreement. The preceding example calculation is based on a data point at the high-flux end
of this data set. (For the purposes of the plot, the two interferometer output beams are averaged.)
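The quadrature sum quoted above is easy to check. The short sketch below uses only the component values given in the text; the two-beam figure assumes, for illustration, that both output beams carry equal flux and visibility so that averaging them gains exactly $1/\sqrt{2}$.

```python
import math

star_error = 2.8    # m/s, star component quoted in the text
iodine_error = 1.5  # m/s, iodine component quoted in the text

# Quadrature sum of the two components for a single output beam.
single_beam = math.hypot(star_error, iodine_error)

# Assumed equal beams: averaging the two outputs improves precision by 1/sqrt(2).
two_beams = single_beam / math.sqrt(2.0)

print(f"single beam: {single_beam:.1f} m/s")
print(f"two beams:   {two_beams:.1f} m/s")
```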

These calculations assume that the flux ratio terms remain the same from channel to channel,
so that an overall mean scaling can be applied. This is not strictly accurate (e.g., if line depths
are very deep and broad, or the pure star and pure iodine continuum functions are very different),
but is taken to work to a reasonable approximation, and seems to correspond quite well with real
results. In the event that a more accurate calculation is needed, however, it is a simple enough
matter to introduce channel-dependent flux ratios for each element $j$ within the summations.

Fig. 4. Measured rms vs. S/N per pixel for the bright stable star 36 UMa over a total of
approximately 2 hr. Obtained with the KPNO ET on May 2, 2007, with varying exposure length
to achieve different S/N levels (approx. 5-6 data points per S/N level). Diamonds indicate the
rms, with error bars corresponding to the uncertainty due to the number of data points over which
the rms is calculated. Crosses and line indicate the corresponding calculated photon limit.



3.1.2. Photon Error for Added Reference Spectrum

In the case of the reference spectrum being combined additively, rather than multiplicatively,
the photon errors must be calculated differently. However, we can follow a somewhat similar
approach. Again, we consider the errors due to star and iodine components of the combined
star/iodine data, and neglect the errors due to the templates, so that, as for equation [41],

$$\epsilon_v^2 = [E_j(\epsilon_{v,\mathrm{sm},j})]^2 + [E_j(\epsilon_{v,\mathrm{im},j})]^2, \qquad (44)$$

where $E_j$ is again defined as in equation [40]. The individual components $\epsilon_{v,\mathrm{sm},j}$ and $\epsilon_{v,\mathrm{im},j}$ must
be reevaluated, however, since the photon noise from the two separate sources will now combine
additively (for example, if one of the sources is considerably brighter than the second, its photon
noise will dominate over the signal in the second). We can think of an effective visibility for the two
components in the combined data, $\alpha_{\mathrm{sm},j}$ and $\alpha_{\mathrm{im},j}$. Remembering that fringe amplitude is given
by the product of the visibility and the mean flux in the fringe, we can write

$$\alpha_{\mathrm{sm},j}\bar{F}_{\mathrm{m},j} = \alpha_{\mathrm{st},j}A_{\mathrm{s}}\bar{F}_{\mathrm{st},j}; \quad \alpha_{\mathrm{im},j}\bar{F}_{\mathrm{m},j} = \alpha_{\mathrm{it},j}A_{\mathrm{i}}\bar{F}_{\mathrm{it},j}, \qquad (45)$$

where $\alpha_{\mathrm{st},j}$ and $\alpha_{\mathrm{it},j}$ are the fringe visibilities for channel $j$ in the star and iodine templates
respectively; and $\bar{F}_{\mathrm{st},j}$, $\bar{F}_{\mathrm{it},j}$, and $\bar{F}_{\mathrm{m},j}$ are the mean fluxes across the channel for the star template, iodine
template, and data (measurement) frame respectively. $A_{\mathrm{s}}$ and $A_{\mathrm{i}}$ are wavelength-independent scaling
factors that allow for flux differences between the templates and the respective data components,
as in section [2.6.2]. Hence we find

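$$\alpha_{\mathrm{sm},j} = \frac{\alpha_{\mathrm{st},j}A_{\mathrm{s}}F_{\mathrm{st},j}}{F_{\mathrm{m},j}}; \quad \alpha_{\mathrm{im},j} = \frac{\alpha_{\mathrm{it},j}A_{\mathrm{i}}F_{\mathrm{it},j}}{F_{\mathrm{m},j}}, \qquad (46)$$

where $F_{\mathrm{st},j}$ and $F_{\mathrm{it},j}$ are the total fluxes across the channel for the star and iodine templates, and
$F_{\mathrm{m},j}$ is the total flux in the channel for the data frame. Substituting these effective visibilities in
equation [34] gives: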



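$$\epsilon_{v,\mathrm{sm},j} = \Gamma\sqrt{2}\,\frac{\sqrt{F_{\mathrm{m},j}}}{\alpha_{\mathrm{st},j}A_{\mathrm{s}}F_{\mathrm{st},j}}; \quad \epsilon_{v,\mathrm{im},j} = \Gamma\sqrt{2}\,\frac{\sqrt{F_{\mathrm{m},j}}}{\alpha_{\mathrm{it},j}A_{\mathrm{i}}F_{\mathrm{it},j}}. \qquad (47)$$

Using these we can now evaluate equation [44] to obtain an estimate of the photon limiting error, so
that: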



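$$\epsilon_v = \Gamma\sqrt{2}\,\sqrt{\left[E_j\!\left(\frac{\sqrt{F_{\mathrm{m},j}}}{\alpha_{\mathrm{st},j}A_{\mathrm{s}}F_{\mathrm{st},j}}\right)\right]^2 + \left[E_j\!\left(\frac{\sqrt{F_{\mathrm{m},j}}}{\alpha_{\mathrm{it},j}A_{\mathrm{i}}F_{\mathrm{it},j}}\right)\right]^2}, \qquad (48)$$

where, due to flux conservation, $A_{\mathrm{s}}$ and $A_{\mathrm{i}}$ are subject to the constraint:

$$A_{\mathrm{s}}F_{\mathrm{st},j} + A_{\mathrm{i}}F_{\mathrm{it},j} = F_{\mathrm{m},j}. \qquad (49)$$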



Again we have found a quadrature summation of errors due to the star and reference components,
scaled to match the respective component fluxes in the data, very similar to equation [43].
However, in this case, the scaling factors $A_{\mathrm{s}}$ and $A_{\mathrm{i}}$ must be determined as parameters during the
velocity shift solution, and $A_{\mathrm{s}}F_{\mathrm{st},j}$ and $A_{\mathrm{i}}F_{\mathrm{it},j}$ represent the fluxes in the star and iodine components
of the data, respectively.

This time, we do not attempt to assume channel-independent flux ratios. This is because for 
additively combined references it becomes possible to consider using emission spectra (e.g., a ThAr 
lamp) as the reference, rather than the usual iodine absorption spectrum. Clearly the flux ratio 
between data and reference template frames is very different for regions where there are no reference 
emission lines compared to those where emission lines are present. It is therefore not reasonable to 
take the flux terms outside the summation in the error combination function Ej. 
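A channel-by-channel evaluation of equations [47]-[49] might look like the sketch below. It is an illustration under stated assumptions only: the function names are hypothetical, template visibilities and total fluxes are assumed to be given per channel, and channels in which a component has zero flux (e.g., between emission lines) are assumed to carry no Doppler information for that component, so they are given infinite error and hence zero weight.

```python
import math
from typing import Sequence, Tuple


def combine_errors(sigmas: Sequence[float]) -> float:
    """E_j of equation [40]; channels with infinite error carry zero weight."""
    weight_sum = sum(0.0 if math.isinf(s) else 1.0 / s**2 for s in sigmas)
    return math.inf if weight_sum == 0.0 else 1.0 / math.sqrt(weight_sum)


def added_reference_error(
    gamma_scale: float,
    vis_star: Sequence[float],
    flux_star: Sequence[float],
    vis_ref: Sequence[float],
    flux_ref: Sequence[float],
    scale_star: float,
    scale_ref: float,
) -> Tuple[float, float, float]:
    """Return (star, reference, total) photon errors, following equation [48]."""
    star_terms, ref_terms = [], []
    for a_s, f_s, a_r, f_r in zip(vis_star, flux_star, vis_ref, flux_ref):
        comp_star = scale_star * f_s  # A_s F_st,j
        comp_ref = scale_ref * f_r    # A_i F_it,j
        root_total = math.sqrt(comp_star + comp_ref)  # sqrt(F_m,j), equation [49]
        star_terms.append(root_total / (a_s * comp_star) if a_s * comp_star > 0 else math.inf)
        ref_terms.append(root_total / (a_r * comp_ref) if a_r * comp_ref > 0 else math.inf)
    prefactor = gamma_scale * math.sqrt(2.0)
    star = prefactor * combine_errors(star_terms)
    ref = prefactor * combine_errors(ref_terms)
    return star, ref, math.hypot(star, ref)
```

Keeping the flux terms inside the per-channel loop is what allows an emission-line reference to be handled without assuming a channel-independent flux ratio.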

To gain a handle on the behavior of equation [48], we can see that if we consider only a single
channel, so that for a function $f$, $E(f) \rightarrow f$, and assume both that the source and reference
visibilities are roughly equal (reasonable for star and iodine, to order of magnitude) and that the
total flux $F_{\mathrm{m}}$ remains constant, then to minimize the total error, we need only to minimize the
function $(A_{\mathrm{s}}F_{\mathrm{st},j})^{-2} + (A_{\mathrm{i}}F_{\mathrm{it},j})^{-2}$. Given the constraint of equation [49], it is straightforward to show
that this is minimized when $A_{\mathrm{s}}F_{\mathrm{st},j} = A_{\mathrm{i}}F_{\mathrm{it},j}$, in other words, when the component star and iodine
fluxes are approximately equal. When either component has a very small flux compared to the
other, one or other of the terms in equation [48] will become very large. Broadly speaking, then, we
can see that the fluxes of star and reference need to be balanced in order to minimize photon error.
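The balance condition can be confirmed numerically with a simple grid search. The sketch below normalizes the total flux to unity (an assumption made purely for convenience) and minimizes the sum of inverse squares of the two component fluxes; the function name is hypothetical.

```python
def relative_error_squared(star_fraction: float) -> float:
    """(A_s F_st)^-2 + (A_i F_it)^-2 for unit total flux and a given star share."""
    return star_fraction**-2 + (1.0 - star_fraction) ** -2


# Scan the star's share of the total flux from 0.1% to 99.9%.
fractions = [k / 1000.0 for k in range(1, 1000)]
best_fraction = min(fractions, key=relative_error_squared)

print(best_fraction)
print(relative_error_squared(best_fraction), relative_error_squared(0.1))
```

The minimum falls at an equal split, and the penalty grows rapidly as either component becomes a small fraction of the total.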

In practice, rather than the total flux being constant, it is of more interest to hold the stellar 
component constant and vary the reference component to find the optimum; using real spectra, 
and allowing differing visibilities, the balance point becomes a little skewed from unity. The ex- 
act optimal balance point depends on the spectra in question. Allowing for the gain in optical 
throughput from losing the absorption in the gas cell reference, this equation at its balance point 
generally gives photon errors on a similar level to the photon errors for a multiplied iodine reference, 
if we use iodine spectra as references in both cases (i.e., tungsten-illuminated iodine in the
added-reference scheme). Using the same observations as in section [3.1.1] to calculate error estimates as
if the spectra had been added, and assuming that the flux level of the star in the template and
the hypothetical combined observation is the same, we find an optimal ratio of iodine to star flux
of 0.96 and a total photon error of 3.6 m s$^{-1}$, comprising iodine and star component errors of
3.1 m s$^{-1}$ and 1.8 m s$^{-1}$, compared to the total error of 3.2 m s$^{-1}$ for multiplied spectra. That the
two are similar is not surprising: adding a reference spectrum to the stellar spectrum at a matching
flux level will approximately halve fringe visibility and hence double the error, but also double the
flux, reducing the error by $1/\sqrt{2}$, giving a total $\sqrt{2}$ increase in the error size. This coincidentally
matches the increase in error size for in-beam iodine calibration due to the fact that the iodine
typically absorbs $\sim 50\%$ of the incident light. (The slight mismatch in the figures calculated is due
to the fact that in the multiplicative case, the combined data frame actually had particularly high
flux, probably because of better sky transparency at the time the frame was taken than when the
template was taken.)
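The $\sqrt{2}$ argument follows directly from the $1/(\alpha\sqrt{F})$ scaling of equation [34], as the short sketch below shows. The function name is hypothetical, and the in-beam case is idealized by assuming that the iodine cell simply removes half the flux while leaving the stellar fringe visibility unchanged.

```python
import math


def relative_error(visibility_factor: float, flux_factor: float) -> float:
    """Change in photon error when visibility and flux are scaled by the given factors."""
    return 1.0 / (visibility_factor * math.sqrt(flux_factor))


# Added reference at matching flux: visibility halves, total flux doubles.
added_reference = relative_error(0.5, 2.0)

# In-beam iodine (idealized): visibility unchanged, about half the light absorbed.
in_beam_reference = relative_error(1.0, 0.5)

print(added_reference, in_beam_reference)
```

Both cases come out at a factor of $\sqrt{2}$ relative to an unreferenced stellar spectrum.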

The above argument holds true for iodine since the continuum shape and fringe visibilities
are broadly similar to those of the stellar spectrum. If we instead use a ThAr emission spectrum
for the added reference, we appear to perform somewhat better than the in-beam iodine case:
the same calculations as above with a ThAr spectrum replacing the iodine spectrum yield a total
photon error of 2.5 m s$^{-1}$, with star and ThAr components of 2.4 m s$^{-1}$ and 0.65 m s$^{-1}$ respectively
(with an optimum ratio of mean fluxes of 0.26, now substantially different because of the very
different nature of an emission spectrum). ThAr also shows a weaker dependence on relative flux
level, which gives it an advantage in terms of practical application since less effort would need to
be expended on matching the brightness to each target observation. This is likely because most
of the Doppler information is concentrated in a few bright lines in the ThAr, whereas it
is spread more broadly across the stellar spectrum. Where the ThAr lines are strong, the stellar
Doppler information is likely largely lost due to the added photon noise. However, since there are
relatively few such lines, there is not too much impact on the total stellar Doppler information, and
increasing the ThAr flux does not make as large a difference as for the case of an iodine spectrum.
We note, however, that at very high flux levels, saturation of the brightest ThAr emission lines is
likely to complicate this analysis somewhat.

Added-beam reference calibration provides one possible solution to the reference addition
approximation error discussed in section [3.3.3], and the discussion here should provide a formalism for
calculating the photon errors. Such a calibration approach, however, has not yet been attempted 
within the ET program, although basic simulations bear out these calculations. 



3.1.3. Photon Error with a Separate Reference 

Finally we consider the simple case where there is no simultaneous common-path reference,
but rather a reference separated either spatially or in time. Once again, we find a weighted
mean velocity shift between (now pure) star measurement and some reference star template, and
the same between a pure reference spectrum measurement and a corresponding template. (The
reference need not be iodine, but we retain the 'i' subscript notation for consistency.) The results
are differenced to obtain a corrected intrinsic stellar Doppler shift. If we neglect the template errors
as before, then the photon errors for the data frame are found again similarly to equation [41]:

$$\epsilon_v^2 = [E_j(\epsilon_{v,\mathrm{s},j})]^2 + [E_j(\epsilon_{v,\mathrm{i},j})]^2. \qquad (50)$$



The difference is that here we use subscripts "s" and "i," rather than "sm" and "im," to indicate
that we are no longer looking at components of a combined reference/star measurement frame, but
at pure star and pure reference measurements respectively. Again substituting the basic photon
error equation, [34], we obtain:

$$\epsilon_v = \Gamma\sqrt{2}\,\sqrt{\left[E_j\!\left(\frac{1}{\alpha_{\mathrm{s},j}\sqrt{F_{\mathrm{s},j}}}\right)\right]^2 + \left[E_j\!\left(\frac{1}{\alpha_{\mathrm{i},j}\sqrt{F_{\mathrm{i},j}}}\right)\right]^2}. \qquad (51)$$



The form of the equation is now much simpler, since no flux or visibility scaling is required. 
Errors again go as the inverse of the visibilities of star and iodine, and as the inverse root of the 
flux. It may also be the case (indeed, observations should be taken such that it is the case) that the 
reference spectrum has significantly higher S/N, and therefore its photon errors can be neglected, 
so that only the first error combination term in the square root remains. 
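A sketch of the separate-reference case, including the option of neglecting a high-S/N reference, is given below. As before this is only an illustration: the function names are hypothetical, and per-channel visibilities and total fluxes of the pure star and pure reference frames are assumed to be available as sequences.

```python
import math
from typing import Optional, Sequence, Tuple


def combine_errors(sigmas: Sequence[float]) -> float:
    """Error combination function E_j of equation [40]."""
    return 1.0 / math.sqrt(sum(1.0 / s**2 for s in sigmas))


def separate_reference_error(
    gamma_scale: float,
    vis_star: Sequence[float],
    flux_star: Sequence[float],
    vis_ref: Optional[Sequence[float]] = None,
    flux_ref: Optional[Sequence[float]] = None,
) -> Tuple[float, float, float]:
    """Return (star, reference, total) photon errors for pure star and pure reference frames."""

    def frame_term(vis: Sequence[float], flux: Sequence[float]) -> float:
        return combine_errors([1.0 / (a * math.sqrt(f)) for a, f in zip(vis, flux)])

    prefactor = gamma_scale * math.sqrt(2.0)
    star = prefactor * frame_term(vis_star, flux_star)
    # Omitting the reference arguments corresponds to neglecting its photon error.
    if vis_ref is None or flux_ref is None:
        ref = 0.0
    else:
        ref = prefactor * frame_term(vis_ref, flux_ref)
    return star, ref, math.hypot(star, ref)
```

Raising the reference flux by a factor of 100 lowers its contribution tenfold, at which point dropping the second term changes the total only marginally.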

Taking our same data and templates once again, we can calculate a hypothetical photon error
for comparison: this time, using iodine as a separate reference yields a total error of 2.9 m s$^{-1}$
comprising star and iodine components of 2.3 m s$^{-1}$ and 1.9 m s$^{-1}$ respectively (note that the iodine
error level here is relatively high in comparison to the star component: this is purely because of the
exceptionally high flux from the star in these particular observations); using ThAr instead yields a
total error of 2.4 m s$^{-1}$, with star and ThAr components of 2.3 m s$^{-1}$ and 0.77 m s$^{-1}$ (though again
we have not included the effects of saturation in the ThAr calculation, which may increase the
ThAr errors somewhat).

This approach is appropriate to the MARVELS/Keck ET, where pure star science exposures 
are bracketed in time with pure iodine reference exposures, and the instrument is highly stabilized 
in both pressure and temperature. The baseline design requirements anticipate a photon error of
3.5 m s$^{-1}$ at $V = 8$ mag, and 21.5 m s$^{-1}$ at $V = 12$ mag (see section [3.1] and Ge et al. 2009).



3.2. Additive Contaminating Spectra 

It is often useful to be able to calculate a rough estimate of the errors due to contaminating ad- 
ditive background spectra. We derive a formalism for doing so here. This formalism will enable us 
to calculate the effect of background moonlight contamination, contaminating background stars, or 
double-lined spectroscopic binaries, for example. In addition, we will then be able to extend the for- 
malism to treat multiplicative (i.e., flux independent) contaminants, such as any residual unfiltered 
comb presence or the iodine/star cross-talk term that causes the reference addition approximation 
error, and try to assess their relative significance. 



3.2.1. Derivation 

Figure [5] shows a fringe along one detector column (in the slit direction) due to the target
source alone, with fringe amplitude $a_{\mathrm{s}}$, mean flux $F_{\mathrm{s}}$, and phase $\phi_{\mathrm{s}}$. For simplicity we assume no
iodine fiducial reference, since we are only aiming for an order-of-magnitude estimate. A second
contaminating fringe of lower amplitude $a_{\mathrm{c}}$ and mean flux $F_{\mathrm{c}}$ due to background contamination is
also shown, with phase $\phi_{\mathrm{c}}$. If the spatial frequency of the fringes is $f$, then the summation of these
two fringes will give the total (also sinusoidal) measured fringe:

$$F_{\mathrm{s}} + \Re\{a_{\mathrm{s}} e^{i(fx+\phi_{\mathrm{s}})}\} + F_{\mathrm{c}} + \Re\{a_{\mathrm{c}} e^{i(fx+\phi_{\mathrm{c}})}\} = F_{\mathrm{s}} + F_{\mathrm{c}} + \Re\{a_{\mathrm{s}} e^{i(fx+\phi_{\mathrm{s}})} + a_{\mathrm{c}} e^{i(fx+\phi_{\mathrm{c}})}\}, \qquad (52)$$

where $x$ identifies position along the slit. $F_{\mathrm{s}} + F_{\mathrm{c}}$ represents the mean value of the measured flux.
The last term represents the varying sinusoidal net fringe.

We are interested in the phase error, $\epsilon_\phi$, introduced into the measured fringe by the contaminating
spectrum. Since we are only interested in the phase information, we ignore the offset term
$F_{\mathrm{s}} + F_{\mathrm{c}}$, and represent the varying term as a vector summation, as shown in figure [6], where $a_{\mathrm{s}}$
and $a_{\mathrm{c}}$ represent the source and contaminant fringe amplitudes as before. The angle $\Delta\phi$ is the
difference between the source and contaminant fringe phases, $\Delta\phi = \phi_{\mathrm{c}} - \phi_{\mathrm{s}}$. Using the sine and
cosine rules for triangles we can show

$$\sin\epsilon_\phi = \frac{a_{\mathrm{c}}\sin\Delta\phi}{\sqrt{a_{\mathrm{s}}^2 + a_{\mathrm{c}}^2 + 2a_{\mathrm{s}}a_{\mathrm{c}}\cos\Delta\phi}}. \qquad (53)$$

Fig. 5. Fringe along one channel due to source (upper curve) and contaminating low flux fringe
(lower curve), as a function of pixel number $x$. Measured fringe is a summation of these two fringes.



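First we consider the case that the two spectra are of similar form and very close in velocity,
so that $\Delta\phi$ is small. Then,

$$\sin\epsilon_\phi \approx \frac{a_{\mathrm{c}}\,\Delta\phi}{a_{\mathrm{s}} + a_{\mathrm{c}}}. \qquad (54)$$

If we assume the source and contaminant fringe visibilities are approximately equal, so that
$a_{\mathrm{s}}/F_{\mathrm{s}} \approx a_{\mathrm{c}}/F_{\mathrm{c}}$, then $a_{\mathrm{c}}/a_{\mathrm{s}} \approx F_{\mathrm{c}}/F_{\mathrm{s}}$, which is equal to the flux ratio of the two fringes. If the contaminating
fringe is much fainter than the source, so that $F_{\mathrm{s}} \gg F_{\mathrm{c}}$, therefore $a_{\mathrm{s}} \gg a_{\mathrm{c}}$, and hence $\epsilon_\phi$ is small,
then

$$\sin\epsilon_\phi \approx \epsilon_\phi \approx \frac{a_{\mathrm{c}}}{a_{\mathrm{s}}}\Delta\phi \approx \frac{F_{\mathrm{c}}}{F_{\mathrm{s}}}\Delta\phi. \qquad (55)$$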

Still assuming that the two spectra are of similar type and velocity, then all wavelength channels
will see approximately the same phase difference between source and contaminant fringes, and this
error will be systematically close to the same across all channels. Therefore, the same result will be
expected finally even after averaging over all channels. Since $\Delta\phi$ is proportional to the difference
in velocity, $\Delta v$, between source and contaminant, then we can write the final error in the measured
velocity, $\epsilon_v$, simply as:

$$\epsilon_v \approx \left\langle \frac{F_{\mathrm{c}}}{F_{\mathrm{s}}} \right\rangle \Delta v \approx \frac{F_{\mathrm{c}}}{F_{\mathrm{s}}}\,\Delta v, \qquad (56)$$

where, in the second form, $F_{\mathrm{c}}$ and $F_{\mathrm{s}}$ now signify mean fluxes for the entire spectra, rather than for individual channels.
In other words, the systematic velocity error due to a faint contaminant of similar spectral type
that is closely matched in velocity is simply the velocity difference scaled by the flux ratio of the
contaminant to the source fringe. (The approximation made in dividing the means in the second
form of this equation is good to first order, and provides a very convenient way to quickly estimate
the errors. See appendix [C] for a derivation and discussion of when it is more appropriate to use
the first form of the equation. The same approximation is also used several times below.)

Fig. 6. Vector representation of the summation of the fringes due to the target source and
background contamination.

This relation does not hold well to arbitrarily large velocity differences, however. From the
geometry of figure [6] it can be seen that a worst case scenario is where the contaminant in all channels
is systematically offset by an amount such that the background contaminant vector is perpendicular
to the measured vector (or, approximately, where $\Delta\phi = \pi/2$). In this case, $\epsilon_\phi \approx a_{\mathrm{c}}/a_{\mathrm{s}} \approx F_{\mathrm{c}}/F_{\mathrm{s}}$, so
that:

$$\epsilon_v \approx \Gamma\,\frac{F_{\mathrm{c}}}{F_{\mathrm{s}}}, \qquad (57)$$

where $\Gamma$ is the phase/velocity scaling factor. Hence the 'worst case scenario' error, where the
velocity offset between source and contaminant is the worst possible and the two spectra are very
close in form, is again simply proportional to the contaminant-to-source flux ratio.
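The vector-sum geometry and its two limiting cases can be checked numerically. The sketch below (hypothetical function names; the source phase is set to zero without loss of generality) computes the exact phase of the summed fringe vector and compares it with equation [53], the small-angle form of equation [54], and the worst-case value $a_{\mathrm{c}}/a_{\mathrm{s}}$.

```python
import cmath
import math


def phase_error(amp_source: float, amp_contaminant: float, delta_phi: float) -> float:
    """Phase of the summed fringe vector relative to the source fringe, in radians."""
    total = amp_source + amp_contaminant * cmath.exp(1j * delta_phi)
    return cmath.phase(total)


def phase_error_eq53(amp_source: float, amp_contaminant: float, delta_phi: float) -> float:
    """Phase error from the sine/cosine-rule expression of equation [53]."""
    resultant = math.sqrt(
        amp_source**2 + amp_contaminant**2
        + 2.0 * amp_source * amp_contaminant * math.cos(delta_phi)
    )
    return math.asin(amp_contaminant * math.sin(delta_phi) / resultant)


if __name__ == "__main__":
    a_s, a_c = 1.0, 0.01  # assumed amplitudes: contaminant 100 times fainter
    for dphi in (0.01, 0.3, math.pi / 2.0, 2.5, math.pi):
        print(f"{dphi:5.2f}  {phase_error(a_s, a_c, dphi):.6f}  {phase_error_eq53(a_s, a_c, dphi):.6f}")
```

For a faint contaminant the error rises linearly at small $\Delta\phi$, peaks near $\Delta\phi = \pi/2$ at approximately the amplitude ratio, and returns to zero at $\Delta\phi = \pi$.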

In the limit that the spectra are completely dissimilar, or are sufficiently separated in velocity
space that overlapping features are in no way correlated, then the phase errors will be randomly
distributed across all channels. Following again from equation [53], we once again assume $F_{\mathrm{s}} \gg F_{\mathrm{c}}$
and $a_{\mathrm{s}} \gg a_{\mathrm{c}}$, which allows us to neglect the $a_{\mathrm{c}}^2$ and $\cos\Delta\phi$ terms; and again that on average
$a_{\mathrm{s}}/F_{\mathrm{s}} \approx a_{\mathrm{c}}/F_{\mathrm{c}} \Rightarrow a_{\mathrm{c}}/a_{\mathrm{s}} \approx F_{\mathrm{c}}/F_{\mathrm{s}}$. Now, however, taking $\Delta\phi$ as uniformly randomly distributed, we
can find the rms value for the phase error in one channel as:

$$\mathrm{rms}(\sin\epsilon_\phi) \approx \mathrm{rms}(\epsilon_\phi) \approx \mathrm{rms}\!\left(\frac{F_{\mathrm{c}}}{F_{\mathrm{s}}}\sin\Delta\phi\right) = \frac{F_{\mathrm{c}}}{\sqrt{2}\,F_{\mathrm{s}}}. \qquad (58)$$

Assuming an average over $n$ independent channels gives a $1/\sqrt{n}$ reduction in the final error, so that
for uncorrelated spectra, we can expect a final velocity error of:

$$\epsilon_v \approx \frac{\Gamma}{\sqrt{2n}}\,\frac{F_{\mathrm{c}}}{F_{\mathrm{s}}}, \qquad (59)$$

where $\Gamma$ is the phase-velocity scale factor for the instrument. The error is now independent of
differential velocity between source and contaminating spectra, since the two spectra no longer
bear any relation to each other (although it may be expected to vary systematically on velocity
difference scales corresponding to the line widths).
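The rms value in equation [58] can be verified with a small Monte Carlo experiment. This sketch is illustrative only: the function name is hypothetical, the source amplitude is normalized to unity, equal visibilities are assumed so that the amplitude ratio equals the flux ratio, and the seed and sample count are arbitrary choices.

```python
import cmath
import math
import random


def rms_phase_error(flux_ratio: float, n_samples: int = 50_000, seed: int = 0) -> float:
    """Monte Carlo rms phase error for a faint contaminant with uniformly random phase offset."""
    rng = random.Random(seed)
    sum_squares = 0.0
    for _ in range(n_samples):
        delta_phi = rng.uniform(0.0, 2.0 * math.pi)
        # Exact phase of the vector sum of a unit source fringe and the contaminant.
        error = cmath.phase(1.0 + flux_ratio * cmath.exp(1j * delta_phi))
        sum_squares += error * error
    return math.sqrt(sum_squares / n_samples)


if __name__ == "__main__":
    ratio = 0.01
    print(rms_phase_error(ratio), ratio / math.sqrt(2.0))
```

The simulated rms should approach $F_{\mathrm{c}}/(\sqrt{2}F_{\mathrm{s}})$ for small flux ratios; dividing by $\sqrt{n}$ and multiplying by $\Gamma$ then gives equation [59].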



3.2.2. Application to Moonlight and Stellar Contamination 

Equations [56], [57], and [59] can be applied directly to estimate the magnitude of the errors
introduced by background scattered moonlight contamination. As an example, a 3" fiber with a
bright-time sky background of 19 mag arcsec$^{-2}$ due to scattered moonlight from the atmosphere
gives a total of 16.9 mag of sky background. For a magnitude 12 star, this gives a
source-to-contamination flux ratio of about 90. In the worst case scenario, from equation [57], assuming
$\Gamma \approx 3700$ m s$^{-1}$ rad$^{-1}$ (corresponding to a 7 mm delay), we find $\epsilon_v \approx 41$ m s$^{-1}$. This will apply
where the stellar spectrum is similar to the moonlight spectrum (not uncommon, since most targets
are sun-like), and in the case where the velocity difference between star and moonlight, $\Delta v$, is
coincidentally around 6 km s$^{-1}$. At smaller velocity differences, the error will scale roughly
linearly as $\epsilon_v = \Delta v/90$ up to this point (equation [56]). After that, it will improve again as $\Delta\phi$
increases to $\pi$, where the phase error once again approaches zero. As $\Delta\phi$ increases, the behavior is
likely to be somewhat oscillatory, with a period of $2\pi\Gamma = 2.3 \times 10^4$ m s$^{-1}$ owing to the geometry of
figure [6], decaying until $\Delta v$ is large enough that the two spectra are completely uncorrelated. For
$n = 1000$ independent wavelength channels (i.e., 4000 pixel channels with an LSF $\sim 4$ pixels wide),
the error should then approach equation [59], with an rms value of around $\epsilon_v \approx 0.8$ m s$^{-1}$. This
should also be the typical error size when the star and moon spectra are very different in form.
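The arithmetic behind these estimates can be reproduced with a few lines of Python. The sketch below uses the numbers quoted in the text; the function names are hypothetical, and the fiber is assumed to collect sky light uniformly over a circular aperture of the stated diameter.

```python
import math


def sky_magnitude_in_fiber(surface_brightness: float, fiber_diameter_arcsec: float) -> float:
    """Integrated sky magnitude over a circular fiber, from mag per square arcsec."""
    area = math.pi * (fiber_diameter_arcsec / 2.0) ** 2
    return surface_brightness - 2.5 * math.log10(area)


def flux_ratio(mag_faint: float, mag_bright: float) -> float:
    """Flux ratio (bright over faint) corresponding to a magnitude difference."""
    return 10.0 ** (0.4 * (mag_faint - mag_bright))


GAMMA = 3700.0   # m/s per rad, value quoted in the text for a 7 mm delay
N_CHANNELS = 1000

sky_mag = sky_magnitude_in_fiber(19.0, 3.0)
source_to_sky = flux_ratio(sky_mag, 12.0)
worst_case = GAMMA / source_to_sky                                 # equation [57]
worst_offset = GAMMA * math.pi / 2.0                               # velocity offset at delta_phi = pi/2
period = 2.0 * math.pi * GAMMA                                     # oscillation period in velocity
uncorrelated = GAMMA / (source_to_sky * math.sqrt(2.0 * N_CHANNELS))  # equation [59]

print(f"sky in fiber: {sky_mag:.1f} mag, flux ratio: {source_to_sky:.0f}")
print(f"worst case: {worst_case:.0f} m/s at offset {worst_offset / 1000.0:.1f} km/s")
print(f"period: {period:.2e} m/s, uncorrelated: {uncorrelated:.1f} m/s")
```

With these rounded inputs the uncorrelated estimate comes out near 0.9 m s$^{-1}$, close to the value of around 0.8 m s$^{-1}$ quoted above; the other figures match the text to the stated precision.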

Simulations of the effect of moonlight contamination show reasonable agreement: figure [7]
shows the RV deviation caused by synthetic moonlight contamination added to a synthetic stellar 
spectrum, and then multiplied by the interferometer response function and degraded in resolution 
to simulate real instrument spectra (ignoring the iodine reference). The resulting spectra are run 
through the standard ET reduction pipeline to assess the effects of the contamination. 

The instrument parameters given here are chosen to match the parameters of the simulation,
which reflect a typical ET-like instrument design. For comparison, the MARVELS/Keck ET in
fact uses 1.8" fibers, with $\Gamma \approx 3400$ m s$^{-1}$ rad$^{-1}$, leading to a worst-case error of only 14 m s$^{-1}$ at
the faintest end of the MARVELS range. Reducing the fiber size is clearly one effective way of
mitigating the effect of moonlight contamination, although this may be at the expense of throughput
if telescope guiding or seeing is not optimal. Reduction of moon contamination error is discussed
further in section [4.1].

Fig. 7. - Simulations of moonlight contamination, showing the systematic error introduced by
contaminating moonlight at 19 mag arcsec$^{-2}$ for a V=12 F9V star on a 3" fiber, as a function of
velocity difference between target star and moon spectrum.
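
The quoted flux ratio and worst-case errors can be checked with a short calculation. The following is a minimal sketch, not part of the ET pipeline. It assumes a uniform sky brightness over a circular fiber that collects all of the stellar light, and it assumes that the worst-case error is approximately $\Gamma$ times the contaminant-to-source flux ratio, which is our reading of the numbers quoted in the text. The function names are illustrative only.

```python
import math

def contaminant_flux_ratio(star_vmag, sky_mag_per_arcsec2, fiber_diameter_arcsec):
    """Return F_s / F_c: stellar flux over the sky flux collected by the fiber."""
    fiber_area = math.pi * (fiber_diameter_arcsec / 2.0) ** 2  # arcsec^2
    # Sky flux within the fiber, relative to the stellar flux.
    sky_to_star = fiber_area * 10.0 ** (-0.4 * (sky_mag_per_arcsec2 - star_vmag))
    return 1.0 / sky_to_star

def worst_case_error(gamma, flux_ratio):
    """Worst-case velocity error (m/s), ASSUMED here to be Gamma * F_c / F_s."""
    return gamma / flux_ratio

def small_shift_error(delta_v, flux_ratio):
    """Linear regime quoted in the text: eps_v ~ delta_v * F_c / F_s."""
    return delta_v / flux_ratio

# V = 12 star, 19 mag/arcsec^2 moonlit sky, 3" and 1.8" fibers.
ratio_3 = contaminant_flux_ratio(12.0, 19.0, 3.0)
ratio_18 = contaminant_flux_ratio(12.0, 19.0, 1.8)
print(round(ratio_3), round(worst_case_error(3700.0, ratio_3)))    # 89 41
print(round(ratio_18), round(worst_case_error(3400.0, ratio_18)))  # 248 14
```

Under these assumptions the 3" fiber gives a flux ratio close to 90 and an error near 41 m s$^{-1}$, and the 1.8" fiber gives about 14 m s$^{-1}$, consistent with the values in the text.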

In exactly the same way, we can calculate the effects of contamination by a background star: 
for example, a background star of the same spectral type and class, but 5 magnitudes fainter 
(i.e., fainter by a flux ratio of 100) would give about the same level of error. At increasingly 
different spectral types, the contaminant star will cause less of a problem as the spectra become 
less correlated. Hence for a faint companion (as opposed to background) star, although the flux 
ratio may be higher, the effect will be at least partially mitigated by the difference in spectral type. 



3.3. Multiplicative Fringe Contamination 

In addition to additive contaminating spectra, certain errors can appear as multiplicative effects 
in the fringing spectra. These are independent of flux and correspond more closely to fringe errors 
rather than flux errors in the spectra. Residual interferometer comb, for example, will behave in 
this way (a concern for multiplied-reference modes of operation), and the cross talk term from the 
reference addition approximation can also be considered in the same way. 



3.3.1. Derivation 

We can follow the same formalism as for background spectrum contamination. In this case,
however, instead of the source and contaminant fringe visibilities being similar, the fluxes are similar,
so that $F_s \approx F_c$. Dividing the denominators of equation [53] through by $F_s \approx F_c$, we can replace
the fringe amplitudes $a_s$ and $a_c$ with their respective visibilities $\alpha_s = a_s/F_s$ and $\alpha_c = a_c/F_c$. The
source spectrum and interferometer comb are completely unrelated in form. Assuming $\alpha_c \ll \alpha_s$ we
can follow the same reasoning as for equation [59] and write:

$$\epsilon_v \approx \frac{\Gamma}{\sqrt{2n}} \frac{\alpha_c}{\alpha_s}, \qquad (60)$$

where $\alpha_c$ and $\alpha_s$ are representative visibilities for the entire contaminant and source spectra respectively
(again see appendix [C] regarding the division of means here). The error is now proportional
to the ratio of visibilities, and again decreases with the root of the number of spectral channels, $n$.



3.3.2. Application to Residual Interferometer Comb

As an example application, we can consider the effect of residual interferometer comb. For 
the case of using a multiplied fiducial spectrum such as in-beam iodine absorption, if the interferometer
comb (section [2.4]) is not completely removed - either by careful tuning of interferometer
delay and slit width or by Fourier filtering in the data processing - then it acts as contaminating 
fringes. In order for the reference addition approximation to work, the comb term in equation [22] 
must be completely removed (see the subsequent discussion). This has in the past been an issue 
with some configurations of the ET instruments, for example: in these cases the sampling by the 
resolution element was such that the comb was aliased in places, creating a low frequency pattern 
in the dispersion direction which was impossible to filter out without losing significant Doppler 
information. 

Here, it is appropriate to take the combined star/iodine data as the source spectrum, since the
comb error arises in the formula for the combined data (equation [22]). A residual comb visibility
of 0.5% (cf. $\sim 1\%$ comb visibility in unfiltered KPNO ET data) on top of a spectrum of typical
mean visibility of, say, 4%, and taking the KPNO ET value of $\Gamma \approx 3300$ m s$^{-1}$ rad$^{-1}$ with $n = 1000$
independent channels, would give an expected error of $\epsilon_v \approx 9$ m s$^{-1}$.
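
As a quick check of this estimate, the sketch below evaluates equation [60] in the form $\epsilon_v \approx (\Gamma/\sqrt{2n})\,(\alpha_c/\alpha_s)$, which is the form assumed here. The function name is illustrative.

```python
import math

def multiplicative_contamination_error(gamma, n_channels, vis_contaminant, vis_source):
    """Velocity error (m/s), assuming eps_v = Gamma / sqrt(2 n) * alpha_c / alpha_s."""
    return gamma / math.sqrt(2.0 * n_channels) * vis_contaminant / vis_source

# 0.5% residual comb visibility on a spectrum of 4% mean visibility.
eps_v = multiplicative_contamination_error(3300.0, 1000, 0.005, 0.04)
print(f"{eps_v:.1f} m/s")  # 9.2 m/s
```

Halving the residual comb visibility halves the error, while a fourfold increase in the number of independent channels is needed for the same gain.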

In practice, however, we have found that, provided comb aliasing is avoided in the instrument 
design and alignment, removing the interferometer comb during image preprocessing with a simple 
one-dimensional low-pass Fourier filter appears to be effective in completely mitigating this error. 
The comb cannot be measured or seen by eye above the photon noise in filtered continuum lamp 
spectra, and we have not yet found any evidence of residual comb causing problems in the final 
data. 



3.3.3. Application to the Addition Approximation 

In order to estimate the errors introduced by the addition approximation discussed in section
[2.6.1], we can also follow a similar approach, treating the cross term from equation [22] which is
ignored in the approximation (or rather, treating the lack of cross term) as if it were a contaminating
spectrum. First, we consider the simplified case of two discrete overlapping Gaussian absorption
lines, from template spectra labeled A and B (e.g., an iodine and a stellar line), combined by
multiplication to give the measured spectrum, labeled M. Both line centers are exactly coincident.
The fractional line depths are represented by $D$ ($0 < D < 1$), with corresponding subscripts $a$, $b$
and $m$. From Ge (2002), we have that in general:

$$\alpha = D e^{-3.56 d^2/l_c^2} \equiv K D, \qquad (61)$$

where $\alpha$ is the absolute fringe visibility (so that the complex visibility is $\boldsymbol{\alpha} = \alpha e^{i\phi}$ as usual), $d$ is the
interferometer delay, $l_c = \lambda^2/\Delta\lambda$ is the coherence length of the interferometer beam with line width
$\Delta\lambda$ at wavelength $\lambda$, and $K = \exp(-3.56 d^2/l_c^2)$ is a constant (for a given wavelength). Although
not very realistic, we begin by assuming both lines A and B and the resulting line M are of similar
width, and that the measured line, which is the product of the two lines, is also approximately
Gaussian. $K$ is then approximately the same for all three lines. We can then write:

$$\begin{aligned}
\alpha_m \approx D_m K &= [1 - (1 - D_a)(1 - D_b)] K \\
&= [D_a + D_b - D_a D_b] K \\
&= \alpha_a + \alpha_b - D_a D_b K. \qquad (62)
\end{aligned}$$

In the addition approximation, the complex visibilities of the template spectra are added 
together. In this simple case, the two lines are centered at the same wavelength and both are 
Gaussian, so that one line is simply a scaled version of the other. By the linearity of Fourier 
transforms, this means that the phases of the two complex visibilities must be identical, so that in 
the addition approximation, the two absolute visibilities add to give $\alpha_m \approx \alpha_a + \alpha_b$. The remaining
term in equation [62] is therefore approximately the error, $\alpha_\epsilon$, the difference between the added
templates and the actual measured visibility:

$$\alpha_\epsilon = D_a D_b K. \qquad (63)$$

In the more general case that the two line centers are not exactly coincident or the same shape,
so that the respective template fringes are not in phase, the error term will also include a phase
difference, becoming a two dimensional vector, $\alpha_\epsilon e^{i\phi_\epsilon}$. Taking the error term above as a reasonable
estimate of the length of this vector and assuming $\phi_\epsilon$ is uniformly randomly distributed, we can
calculate a corresponding representative error in phase of the summation approximation. Figure
[8] shows the addition of the "true" (measured) complex visibility and the error term to give the
solution according to the summation approximation. $\phi$ represents the phase of the true complex
visibility, and $\epsilon_\phi$ represents the error in the measurement of that phase. If we assume the resulting
measured visibility vectors and the error terms are uncorrelated from channel to channel, and if we
take $D_a$ and $D_b$ to be some kind of representative average line depth for the two spectra across all
$j$, we can derive the typical expected velocity error following the same reasoning as for equation [60]
and write:

$$\epsilon_v \approx \frac{\Gamma}{\sqrt{2n}} \frac{\alpha_\epsilon}{\alpha_m} = \frac{\Gamma}{\sqrt{2n}} \frac{D_a D_b}{D_a + D_b - D_a D_b}, \qquad (64)$$

where $n$ is again the number of independent channels, and the constant $K$ cancels. (Note that
although angles $\phi_\epsilon$ in figure [8] and $\Delta\phi$ in figure [6] are measured from different origins, they are
in both cases taken to be uniformly randomly distributed between 0 and $2\pi$, so that the same
reasoning applies for both.)

Hence we find again that the error decreases as the square root of the number of spectral
channels; unsurprisingly, it also increases with line depth, since deeper lines allow for more cross
talk. If, however, either one of the representative line depths is very small, then the velocity error
also becomes small, becoming approximately linear with the smaller line depth and independent of
the larger as the smaller tends to zero.

Fig. 8. - Vector representation of the summation of the true complex visibility and the error term
due to the addition approximation.

Figure [9] shows the expected typical error as a function of average line depth for the simplified 
case where the typical depths of the two spectra are equal, and taking $\Gamma \approx 3300$ m s$^{-1}$ rad$^{-1}$ and
$n = 1000$. For average line depths of, say, 80% for both star and iodine, this gives a typical error
due to the addition approximation of $\sim 50$ m s$^{-1}$, which is clearly very significant. The error will
manifest as a systematic error in the velocity response of the instrument, essentially adding noise 
which varies as a function of the specific overlapping of the lines between target star and reference 
spectrum. It will therefore vary with stellar spectral type, class, and line width, and will also vary 
as a function of the intrinsic absolute Doppler shift of the stellar spectrum. Since the stellar lines 
are generally considerably broader than the iodine lines, if the stellar lines slowly shift relative to 
the iodine lines, the noise term will slowly change until the point where a shift of more than a 
stellar line width has been reached. At this point, the stellar lines are overlapping completely new 
iodine features, and the noise term will take on a new value that is completely uncorrelated with 
its previous value. Hence, we expect a non-linearity in the velocity response of the instrument, 
with a standard deviation somewhere on the order of 50 m s$^{-1}$ and that varies with Doppler shift
on a scale of approximately the line width of the star. For solar-type stars observable with ET,
this variation will be over scales typically on the order of 5-10 km s$^{-1}$.

Figure [10] shows the results of simulated fringing spectra run through the reduction pipeline
to see the effect of non-linearity due to the addition approximation, and shows broad agreement
with these expectations. (For the simulation, the phase-velocity scaling factor $\Gamma \approx 3700$ m s$^{-1}$ rad$^{-1}$ was
used; for this value of $\Gamma$, our previous calculation yields $\sim 55$ m s$^{-1}$.)

Fig. 9. - Analytically calculated expected error due to the addition approximation, assuming
approximately equal line depths for both star and reference spectra.
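
The two values quoted above (roughly 50 and 55 m s$^{-1}$) can be reproduced with the following minimal sketch. It assumes equation [64] in the form $\epsilon_v \approx (\Gamma/\sqrt{2n})\, D_a D_b/(D_a + D_b - D_a D_b)$ and uses the simplified coincident-line picture behind equation [62]. The function names are illustrative.

```python
import math

def combined_line_depth(d_a, d_b):
    """Central depth of the product of two coincident lines of depths d_a and d_b."""
    return 1.0 - (1.0 - d_a) * (1.0 - d_b)  # equals d_a + d_b - d_a * d_b

def addition_approximation_error(gamma, n_channels, d_a, d_b):
    """Typical velocity error (m/s) under the assumed form of equation [64]."""
    cross_term = d_a * d_b                    # alpha_eps / K (equation [63])
    measured = combined_line_depth(d_a, d_b)  # alpha_m / K (equation [62])
    return gamma / math.sqrt(2.0 * n_channels) * cross_term / measured

# 80% average line depth for both star and iodine, n = 1000 channels.
for gamma in (3300.0, 3700.0):
    print(gamma, round(addition_approximation_error(gamma, 1000, 0.8, 0.8), 1))
# 3300.0 49.2
# 3700.0 55.2
```

Setting one depth to a small value (for example 0.01) and varying the other shows the limiting behavior described in the text: the error then depends almost entirely on the smaller depth.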



Clearly the addition approximation is a very significant source of systematic error, and cannot 
be neglected. The systematic error will affect any DFDI instrument that depends on in-beam 
multiplied reference spectra. Various approaches to correct the error are under consideration, 
although an exact analytical solution - if one exists - remains elusive; for the MARVELS survey 
and the current KPNO ET (now undergoing upgrade), simultaneous in-beam iodine calibration is
simply avoided, instead relying on good instrument stability and bracketing exposures in time with
reference iodine frames to calibrate out instrument drift. Possible approaches to dealing with the
addition approximation error are discussed in section [4.2], and alternative calibration methods that
circumvent the approximation altogether in section [4.3].



4. Discussion 



4.1. Moonlight Contamination 

As surveying for exoplanets down to fainter and fainter magnitudes during bright-sky time 
continues with the MARVELS project, moonlight contamination is likely to become an important 
issue (keeping fiber diameter small and avoiding bright time notwithstanding). With the current 
ET instruments and pipeline, it is unlikely that direct subtraction at the whirl or the initial image 
stage would be successful. Bracketing science exposures in time with sky exposures to measure 
the background would seriously impact the observing cadence, reducing on-sky exposure time by a
factor of two or more, and would likely suffer from rapidly varying sky background in the presence
of even thin cloud. Using simultaneous sky-fibers alongside the science fibers for direct subtraction
in image space would require extremely precise modeling of the instrument to successfully map the
spectrum from one fiber onto another. For subtraction in whirl space there is insufficient flux from
such a faint background to be able to successfully measure meaningful whirls, at least using our
current data analysis techniques.

Fig. 10. - Simulations showing the addition approximation error. The non-linearity in the RV
response has the same order of magnitude and appears on the same input velocity scales as expected
from theoretical predictions.

Given adequate templates of the moon spectrum, however (a solar spectrum may suffice), then 
it may be possible to model out the moon error by treating it as a further additive component in 
solving for the stellar Doppler shift, just as for the case of added reference spectra. This could 
in principle be done alongside any simultaneous reference spectra, and since the moon spectrum 
is an additive component, it would not suffer from the addition approximation errors associated 
with an in-beam calibration source. Alternatively, forward-modeling from high-resolution spectra
to match the measured whirl data (or even the fringing spectrum images), an approach currently
under consideration for reducing ET data, could allow for moon contamination to be included as a
part of the model.

The original KPNO ET having been designed for brighter sources where moonlight is less of a 
concern, approaches to mitigation of moonlight contamination are still under investigation. 



4.2. The Addition Approximation Error 

One of the most significant concerns with the DFDI technique for exoplanet searches when 
using superposed iodine is clearly the addition approximation error. Causing long term systematic 
errors on the scale of up to $\sim 100$ m s$^{-1}$, the approximation can potentially have a serious adverse
effect on the measurement of exoplanet RV signatures. The effect can be mitigated to a certain 
extent simply by judicious selection of observation times and positions of targets on the sky, so 
that the line-of-sight barycentric motion of the Earth - usually the dominant effect that causes the 
non-linearity to become significant - is minimized. Such observations are often not hard to achieve 
at least over periods of a few days. This explains why we were earlier still able to make successful
detections of 51 Peg b and HD 102195 b (van Eyken et al. 2004a; Ge et al. 2006) despite then being
unaware of the effect: for both targets, the sky positions and epochs of observation were such that
the change in barycentric correction over the lengths of the individual observing runs was small 
compared to the variation scale of the addition error, so that the addition errors were absorbed 
in small corrections to the phase-velocity scale. Nonetheless, the addition approximation error 
becomes significant for velocity shifts on scales upward of the line width of the stellar spectrum, 
and placing stringent constraints on the times of observation is likely to cause serious aliasing issues 
due to the observation window function. Furthermore, over 24 hours, the barycentric motion of
the observatory contributes a variation of up to $\sim 1$ km s$^{-1}$ in radial velocity due to the Earth's
rotation, and up to 60 km s$^{-1}$ over a year due to the Earth's orbital motion. For the slowest rotating,
most narrow-lined (and hence best Doppler precision) stars, the line width may be on the order
of 1 km s$^{-1}$, so that even over one night the error is of concern. Forcing the observing cadence is
therefore certainly not a satisfactory long term solution.
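
The size of these barycentric velocity variations follows from simple circular-motion estimates. The sketch below is an upper-bound calculation only. It assumes an observatory on the equator, a circular Earth orbit, and a target lying in the plane of the motion; the constants are standard values and the function names are illustrative.

```python
import math

EARTH_EQUATORIAL_RADIUS_M = 6.378137e6   # WGS84 equatorial radius
SIDEREAL_DAY_S = 86164.1                 # one rotation relative to the stars
ASTRONOMICAL_UNIT_M = 1.495978707e11
SIDEREAL_YEAR_S = 365.25636 * 86400.0

def circular_speed(radius_m, period_s):
    """Speed (m/s) of uniform circular motion with the given radius and period."""
    return 2.0 * math.pi * radius_m / period_s

def max_rv_swing(radius_m, period_s, projection=1.0):
    """Peak-to-peak line-of-sight velocity (m/s); projection <= 1 scales for geometry."""
    return 2.0 * circular_speed(radius_m, period_s) * projection

diurnal = max_rv_swing(EARTH_EQUATORIAL_RADIUS_M, SIDEREAL_DAY_S)
annual = max_rv_swing(ASTRONOMICAL_UNIT_M, SIDEREAL_YEAR_S)
print(f"diurnal: {diurnal / 1e3:.2f} km/s, annual: {annual / 1e3:.1f} km/s")
# diurnal: 0.93 km/s, annual: 59.6 km/s
```

Real observations sample only part of each cycle and have less favorable geometry, so the variations actually encountered are smaller than these limits.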

Clearly, a robust solution to the problem of the addition approximation error needs to be 
found. The ideal would be to find a mathematically exact solution to the spectrum combination 
equations - or at least a more accurate approximation - but this remains elusive, and it is not 
clear whether such a solution even exists. Modifying the template whirls cannot solve the problem, 
since the templates themselves are correct: to solve the problem, the cross-talk term discussed in 
section [2.6.1] that constitutes the error must be directly calculated, or at least approximated. Hence
iterative approaches which perturb the template whirls in an attempt to minimize the residuals 
will only end up introducing error in order to fit the cross talk term. 

One way or another, calculating the cross-talk term seems to require knowledge of the un- 
derlying high-resolution spectrum of the two templates. Efforts to model the error term using 
high-resolution iodine and synthetic stellar spectra have shown some promise. In this approach, 
the cross talk term is calculated directly, so that a grid of corrections across velocity and stellar 
parameter space can in principle be created to apply to real data. Alternatively, appropriately 
parametrized high resolution spectra could be forward modeled to match the data from the in- 
strument, using the formalisms presented previously: as well as accounting for the cross talk term 
naturally as part of the process, this could also allow for addition of a third spectrum in the model 
to account for sky background in an attempt to remove moonlight contamination. This would, 
however, likely require extremely precise modeling and calibration of the instrument, losing the 
benefit of the self-calibrating nature of real templates. 

Alternatively, one can consider trying to obtain the high resolution information from the data 
itself. Two possible approaches to recreating the underlying spectrum are as follows: one is to 
begin with the low-resolution non-fringing spectra obtained from the DFDI fringing spectra (e.g.,
by binning the spectrum in the slit direction) as an approximation, using this to help model the cross
talk, and then iterate with successive perturbations to the spectrum until the residuals between real
data whirl and the sum of the template whirls and the cross-talk correction are minimized (similar
to the approach by Johnson et al. (2006) used to measure RVs without formal templates using a
traditional spectrograph). All the information necessary to reconstruct the underlying spectrum
may not be present in the cross talk term, however: one can imagine degeneracies, for example,
where two closely spaced absorption lines within a resolution element may lead to the same fringe
phase and visibility as a single line of a different depth positioned midway between the two. It may
therefore not be possible to iterate towards a single solution based on a single data frame. However,
once multiple observations at different RVs have been taken, where the stellar lines overlap different
parts of the reference spectrum, we might conceivably be able to use the aggregate information to
help break any degeneracies. The more data measurements, the more accurate becomes the estimate
of the underlying spectrum. This is somewhat analogous to the concept employed by Konacki et al.
(2009) in improving individual star templates from double-lined spectroscopic binaries.

The second approach, which is likely to be simpler, is to try to obtain an improved estimate of
the cross-talk term by calculating it based on template spectra reconstructed at higher resolution
than the nominal spectrograph resolution by using the information in the fringes. Such a
reconstruction is described by Erskine et al. (2003). Although having a single fixed delay constrains the
degree to which high-resolution information can be recovered, it could at least provide a first-order
correction for the addition approximation error.

At the time of writing, all these approaches represent avenues for further investigation; the 
best solution may involve some combination of the above. 



4.3. Alternatives to Multiplied References 

Instead of attempting to calculate or model the addition approximation error, an alternative 
is to circumvent the problem altogether by considering different instrumental approaches to RV 
calibration. 

One such approach would be 'combined beam' superposition of the reference spectrum, where a 
reference is literally added to the stellar spectrum, for example, by splicing two input fibers together 
into one. In this way, there is no longer an approximation in combining the template whirls: the 
equations are exact (section [2.6.2]). Combined-beam superposition does run the risk of adding
photon noise to the stellar spectrum: for an absorption reference (e.g., a tungsten-illuminated 
iodine cell), the star and reference spectra need to be balanced in flux. This adds complexity to 
the observing in that it requires advance knowledge of target fluxes and careful preparation for 
observations, and is more likely to be practical for a single-object than a multi-object instrument. 
Calculations of the added-reference photon limit based on real KPNO ET fringing spectra suggest 
that provided the spectra are properly balanced, similar precision can be obtained as for a multiplied 
reference spectrum at a given S/N in the data, when one allows for the gain in flux from the lack 
of reference absorption. Similar test calculations with ThAr emission as a reference show that the 
photon noise is less sensitive to the relative intensity of an emission spectrum than an absorption 
spectrum (see section [3.1.2]), relaxing the requirements on intensity matching: such an approach
may therefore overcome some of the complexity of combined-beam observations and may even be 
practical for a multi-object instrument. 

Another intriguing question is whether the interferometer comb itself could be used as a fiducial 
reference instead of iodine or ThAr. Changes in the interferometer delay will shift the phase of the 
comb, and so it can in principle track instrument drift, provided the comb and star signals can be 
adequately separated. The problem lies in the symmetry of the comb: as a simple example, if the 
image on the detector were to drift in a direction exactly parallel to the comb, the stellar fringes 
would appear to shift in phase and in the dispersion direction, and yet the comb would appear not to 
have changed at all, leading to the incorrect conclusion that the shift is wholly intrinsic to the star. 
If the image on the detector can reliably be stabilized to sufficient accuracy in either the slit or the
dispersion direction (or both), then there would be sufficient information to break the degeneracy
between stellar and instrument shift, and the intrinsic RV could be measured. This would be a 
big step forward, allowing simultaneous common-beam calibration with neither flux loss (as in 
iodine absorption) nor photon noise addition (as in ThAr superposition), and obviating the need
for any reference spectrum at all. (The USNO dFTS instrument (Hajian et al. 2007), with its lack
of dependence on simultaneous in-beam calibration, is somewhat similar in this respect, although 
a precise metrology system is needed to measure the varying interferometer delay.) As pointed out 
by our anonymous referee, the degeneracy could conceivably be broken given a sufficiently large 
spectral bandpass: the wavelength dependence of the comb frequency (see, e.g., figure ED would 
allow for the measurement of the image shift in the dispersion direction. The large bandpass, 
however, would need to be balanced with the requirement that the comb be resolved well enough 
to be measurable, which may be hard with standard CCD detector sizes unless longer wavelengths 
are used (since longer wavelengths exhibit a lower comb density in the dispersion direction). 

Finally, if instrument stability can be controlled well enough, simply running parallel reference 
spectra alongside the stellar spectra, or alternatively, bracketing stellar exposures in time with 
reference exposures, can provide another solution: this has the twofold benefit of vastly simplifying 
the data analysis and eliminating the significant throughput loss (~ 30%-50%) due to insertion of 
an iodine cell into the beam path. This is the current method of choice for the MARVELS survey, 
and is also currently employed by the KPNO ET. The MARVELS/Keck ET instrument is pressure 
stabilized, and thermally controlled to the few-mK level, allowing for very good instrument stability.
Although not as precise as simultaneous common path calibration, the results are adequate for the 
moderate-precision large-scale survey for which the Keck ET is intended. On-sky results with the 
Keck ET show that this approach is feasible, and exposure bracketing is to be employed in the full 
survey. 



4.4. The Technique in Practice 

Beginning with the confirmation of 51 Peg b, and with the later discovery of HD 102195 b,
the ET instruments have convincingly demonstrated the capacity of dispersed fixed-delay
interferometry for exoplanet detection and discovery (van Eyken et al. 2004a; Ge et al. 2006). Even
in the presence of the addition-approximation error, both the single-object ET at KPNO and the
multi-object Keck/MARVELS ET at APO have been able to routinely uncover the RV signals of
known exoplanets. The early KPNO ET, using in-beam iodine, demonstrated photon-limited
precision at the 2-3 m s$^{-1}$ level with bright reference stars on the very short term (see earlier, figure
H]). Figure [11] shows observations of $\eta$ Cas, an RV-stable star, using the same instrument over a
longer period of several months, with an rms of 10.8 m s$^{-1}$ (compared to a mean photon limit of
5.6 m s$^{-1}$). Evidently in this case, the addition approximation error due to the iodine was not too
extreme. For comparison, typical rms measurements on bright reference stars were on the order of
8-10 m s$^{-1}$ over typical observing runs of $\sim 1$ week, where the addition approximation error would
normally be quite small (and to some extent absorbed in determination of the phase-to-velocity
scale, $\Gamma$).

Fig. 11. - Differential RV measurements of $\eta$ Cas, an RV-stable star, using the single-object KPNO
ET. The rms scatter is 10.8 m s$^{-1}$; error bars indicate the size of the photon error (mean 5.6 m s$^{-1}$), and
do not include correlated systematic errors.

In anticipation of the upcoming MARVELS survey, the Keck instrument saw a major up- 
grade in 2008, with a more stable mechanical design, pressure stabilization, and extremely pre- 
cise thermal control, rendering the instrument stable enough to make exposure bracketing with 
tungsten-illuminated iodine reference spectra feasible. As a result, the addition approximation 
issue is eliminated and throughput substantially increased (albeit at the expense of some loss of 
precision due to the separated target and fiducial light paths). The baseline requirements for the 
MARVELS survey with this approach call for rms errors of 14 m s$^{-1}$ and 35 m s$^{-1}$ at $V = 8$ mag and
$V = 12$ mag respectively, with corresponding photon error components of 3.5 m s$^{-1}$ and 21.5 m s$^{-1}$
(Ge et al. 2009). HD 9407 (stable, $V = 6.6$ mag) shows an rms of 11.3 m s$^{-1}$ over four months,
fairly typical for stars at this brightest end of the target range; current typical performance shows
an rms of 15 m s$^{-1}$ at $V = 8$ mag, and 42 m s$^{-1}$ at $V = 12$ mag.

The KPNO ET is currently being upgraded, and in light of the addition approximation
error, observations are also now taken in an iodine bracketing mode, with no simultaneous
calibration. Further details and results from both instruments can be found in van Eyken et al.
(2004a); Ge et al. (2006); Mahadevan; van Eyken; Mahadevan et al.; Ge et al. (2009).
Such precisions are adequate for finding planets with minimum masses ($M \sin i$)
of order $1\,M_J$ or more in few-day orbits (i.e., hot Jupiters) down to $V = 12$. They are also more
than adequate for uncovering stellar binary and brown dwarf companions.
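
To put the quoted precisions in context, the sketch below evaluates the standard expression for the stellar RV semi-amplitude induced by a planet, $K \approx 28.43\ \mathrm{m\,s^{-1}}\,(M \sin i / M_J)\,(P/1\,\mathrm{yr})^{-1/3}\,(M_*/M_\odot)^{-2/3}\,(1-e^2)^{-1/2}$, valid when the planet mass is much smaller than the stellar mass. This formula is not taken from the present paper, and the 3-day period and solar-mass host are illustrative assumptions.

```python
def rv_semi_amplitude(m_sin_i_mjup, period_days, stellar_mass_msun=1.0, eccentricity=0.0):
    """Stellar RV semi-amplitude (m/s), assuming planet mass << stellar mass."""
    period_years = period_days / 365.25
    return (28.43 * m_sin_i_mjup
            * period_years ** (-1.0 / 3.0)
            * stellar_mass_msun ** (-2.0 / 3.0)
            / (1.0 - eccentricity ** 2) ** 0.5)

# Illustrative hot Jupiter: 1 M_J minimum mass, 3-day circular orbit, solar-mass star.
k_hot_jupiter = rv_semi_amplitude(1.0, 3.0)
for rms in (15.0, 42.0):  # typical rms values quoted above for V = 8 and V = 12
    print(f"K = {k_hot_jupiter:.0f} m/s, K/rms = {k_hot_jupiter / rms:.1f}")
```

For these assumed parameters the semi-amplitude is roughly 140 m s$^{-1}$, several times the typical rms even at the faint end of the target range.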

We have presented here an overview of a mathematical basis for understanding DFDI data for 
precision radial velocity measurements, and discussed analytical approaches to some of the error 
sources that would affect any implementation of the technique. The formulae derived should prove 
useful for interpreting the data from any future implementations of such instruments. As the ET 
instruments' overall precision and reliability continue to improve, it is our hope that the DFDI
technique will be able to make a significant contribution to the known extrasolar planet sample
over the coming years.



A. The Spectrograph Response Function and the LSF

The response function Wj{X) due to the spectrograph optics is very closely related to the instru- 
ment LSF. As defined for the purposes of this paper, the spectrograph response function specifies 
the respective instrumental throughputs for the finite range of wavelengths. A, falling at a given 
position in the dispersion direction, x = j, on the detector, corresponding to a spatially infinitesi- 
mally narrow channel in the spectrum. By contrast the LSF specifies the fiux distribution on the 
detector as a function of spatial position in the dispersion direction due to a single monochromatic 
wavelength of light. The response function at a particular position is therefore determined by the 
way the LSF's from all the different wavelengths overlap at that position. 

If we assume the LSF is approximately identical in form at closely separated channels x on the 
detector, where x represents the pixel position in the dispersion direction (not necessarily integer), 
then we can define the LSF as L(x,xo(Ao)) = Lt{x — xo(Ao)) where L represents the normalized 
envelope of flux spread across pixels x on the detector due to monochromatic light of wavelength Aq, 
centered at position xq. L is simply a shifted version of Lt{x), which represents a template of the 
LSF centered at x = 0. In general x{X) represents the wavelength calibration mapping wavelength 
to detector position. 

The response function at an infinitesimally wide position on the detector, w, is given by writing 
the contribution from each overlapping LSF at that position. The contribution from wavelength Aq 
at position xi is given by w{Xo,xi) = L{xi,xo{Xq)) = Lt{xi — xo(Ao)). Therefore as a continuous 
function of general wavelength A, we can write the response function at position xi as 

wiX,xi) = Ltixi-x{X)). (Al) 

We see that this is really just the LSF reversed (since the x term is now negative). 

We can now extend this to the total contribution across an entire pixel at position x = j where 
j is now an integer representing pixel number. Let tj{x') represent the pixel response function. 
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describing the normalized throughput of the pixel across its width as a function of x' . Then we can 
write w{\,x')tj{x')dx' = Lt(x' — x{X))tj{x')dx' . Summing over all x', we have the full response 
function Wj for pixel column j, given by: 

Wj{\)= f Lt{x' -x{X))tj{x')dx', (A2) 



i.e., essentially a convolution of the response function with the pixel response function. To the
extent that the width of the pixel is narrow compared to the LSF (i.e., that the image is well
over-sampled), then to a reasonable approximation, $t_j$ is close to a delta function, $W \approx w$, and
the instrument response function at position $x$ is approximately just the reversed LSF. Analogous
arguments can be followed in wavenumber ($\kappa$) space instead of wavelength space, simply replacing
$\lambda$ with $\kappa$ to obtain exactly the same results as a function of $\kappa$.
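As a rough numerical illustration of equations (A1) and (A2), the following Python sketch integrates a template LSF over a pixel and compares the result with the reversed LSF. The asymmetric two-sided Gaussian template, the linear wavelength calibration, the top-hat pixel throughput, and all function names and parameter values are assumptions made purely for illustration; none of them is taken from the ET instruments.

```python
import numpy as np

def lsf_template(x, sigma=1.5, skew=0.5):
    """Hypothetical peak-normalized LSF template L_t(x), with x in pixels.

    Deliberately asymmetric (broader wing at x > 0) so that the
    reversal in equation (A1) is visible.
    """
    x = np.asarray(x, dtype=float)
    width = np.where(x < 0.0, sigma, sigma * (1.0 + skew))
    return np.exp(-0.5 * (x / width) ** 2)

def x_of_lambda(lam, lam_ref=5000.0, dispersion=0.1):
    """Assumed linear wavelength calibration x(lambda): Angstrom -> pixels."""
    return (np.asarray(lam, dtype=float) - lam_ref) / dispersion

def pixel_response(lam, j, pixel_width=1.0, n_sub=201):
    """Equation (A2) with a normalized top-hat t_j centered on pixel j."""
    lam = np.atleast_1d(lam)
    # Midpoints of n_sub equal sub-intervals spanning the pixel.
    edges = np.linspace(j - pixel_width / 2, j + pixel_width / 2, n_sub + 1)
    xp = 0.5 * (edges[:-1] + edges[1:])
    dx = pixel_width / n_sub
    t_j = np.full(n_sub, 1.0 / pixel_width)
    integrand = lsf_template(xp[None, :] - x_of_lambda(lam)[:, None]) * t_j
    return integrand.sum(axis=1) * dx  # midpoint-rule integral over x'

lam = np.linspace(4999.0, 5001.0, 401)
reversed_lsf = lsf_template(0.0 - x_of_lambda(lam))     # equation (A1) at x_1 = 0
W_narrow = pixel_response(lam, j=0, pixel_width=1e-3)   # t_j close to a delta function
W_wide = pixel_response(lam, j=0, pixel_width=1.0)      # one full pixel
print(np.max(np.abs(W_narrow - reversed_lsf)))
print(np.max(np.abs(W_wide - reversed_lsf)))
```

With the very narrow pixel the computed response should be practically indistinguishable from the reversed LSF, while the full-width pixel shows the additional smoothing introduced by $t_j$.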



B. Fringe Formation: an Alternative Viewpoint

We can view the formation of the fringes as given by equation [8] in another way. Beginning 
again with equation [1] for the flux at a given channel, j, as a function of delay d, and again 
substituting Q, we can write 

$$I_j(d) = \int Q(\kappa)\,\Re\{1 + e^{-i2\pi\kappa d}\}\,d\kappa = \int P(\kappa)\,w_j(\kappa)\,[1 + \cos(2\pi\kappa d)]\,d\kappa, \tag{B1}$$

where again the spectrum within the channel is given by $Q(\kappa) = P(\kappa)\,w_j(\kappa)$, with $P(\kappa)$ being
the full spectrum entering the instrument, and $w_j(\kappa)$ being the spectrograph response function for
channel $j$.

If we assume that the spectrograph response function is uniform across the whole spectrum
(i.e., for all $j$), then we can express it as a wavenumber-shifted version of a global 'template'
spectrograph response function, $w_t$ (which is centered at $\kappa = 0$), shifted so that its center is at
the central wavenumber of the channel in question, $\kappa_j$. From appendix A we know that the
spectrograph response function is just the reverse of the LSF, so we can write

$$w_j(\kappa) = w_t(\kappa - \kappa_j) = L_t(\kappa_j - \kappa), \tag{B2}$$

where $L_t$ represents a global template LSF, also centered at $\kappa = 0$. If we furthermore define
$T(\kappa, d) = 1 + \cos(2\pi\kappa d)$, we can therefore rewrite equation [B1] as

$$I_j(d) = \int P(\kappa)\,T(\kappa, d)\,L_t(\kappa_j - \kappa)\,d\kappa. \tag{B3}$$

$T$ can be thought of as the interferometer transmission function, equivalent to the interferogram
that would be obtained for pure white light and an infinite resolution spectrograph (exactly as in
equation [12]). Equation [B3] can be identified as a convolution over the variable $\kappa$:

$$I_j(d, \kappa_j) = [P(\kappa)\,T(\kappa, d)] \otimes L_t(\kappa), \tag{B4}$$

where $\otimes$ denotes convolution, and the convolution is evaluated at wavenumber $\kappa_j$.

In other words, we have simply the input spectrum multiplied with the interferometer transmission
function, and then convolved with the LSF due to the spectrograph. Thinking in two
dimensions, to match the wide-slit format of the actual ET spectra, we can replace the LSF with
its two-dimensional equivalent, the instrument point spread function (PSF). Exactly the same results
can be derived in frequency space, simply by substituting frequency $\nu$ for $\kappa$ and time delay $\tau$
for $d$.



This way of looking at fringe formation is the approach used by Erskine (2003) and followed by
Mahadevan (2006), and can conveniently be employed to quickly produce simulated DFDI spectra
(although with the caveat that it assumes a uniform LSF, which in practice is unlikely to be very
realistic).
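A minimal sketch of how equations (B3) and (B4) might be turned into a simulation is given below: the input spectrum is multiplied by the transmission function $T(\kappa, d)$ and then convolved with a template LSF. The Gaussian LSF, the wavenumber grid, the delay of about 7 mm, the line depth and width, and the function name `simulate_fringes` are all illustrative assumptions rather than parameters of any actual instrument, and the code is not the pipeline used by the authors cited above.

```python
import numpy as np

def simulate_fringes(kappa, spectrum, delays, lsf_sigma):
    """Sketch of equations (B3)/(B4): flux I[i, j] at delay i and channel j.

    kappa     : uniformly spaced wavenumber grid (cm^-1)
    spectrum  : input spectrum P(kappa) sampled on that grid
    delays    : interferometer delays d (cm)
    lsf_sigma : width of the assumed Gaussian template LSF (cm^-1)
    """
    dk = kappa[1] - kappa[0]
    half = int(np.ceil(5.0 * lsf_sigma / dk))
    offsets = np.arange(-half, half + 1) * dk
    lsf = np.exp(-0.5 * (offsets / lsf_sigma) ** 2)
    lsf /= lsf.sum()  # unit area, so the continuum level is preserved
    flux = np.empty((len(delays), len(kappa)))
    for i, d in enumerate(delays):
        transmission = 1.0 + np.cos(2.0 * np.pi * kappa * d)  # T(kappa, d)
        # Multiply by T, then convolve with the LSF (edge channels are unreliable).
        flux[i] = np.convolve(spectrum * transmission, lsf, mode="same")
    return flux

# Illustrative numbers only: wavelengths near 5500 Angstrom and a delay near 0.7 cm.
kappa = 18150.0 + 0.02 * np.arange(3001)
kappa_line = kappa[1500]
white = np.ones_like(kappa)
line = 1.0 - 0.8 * np.exp(-0.5 * ((kappa - kappa_line) / 0.05) ** 2)
# Step the delay through one fringe period at the line wavenumber.
delays = 0.7 + np.linspace(0.0, 1.0 / kappa_line, 16, endpoint=False)

fringe_white = simulate_fringes(kappa, white, delays, lsf_sigma=1.5)[:, 1500]
fringe_line = simulate_fringes(kappa, line, delays, lsf_sigma=1.5)[:, 1500]
print(fringe_white.max() - fringe_white.min())
print(fringe_line.max() - fringe_line.min())
```

In this toy setup the interferometer comb is unresolved by the LSF, so the white-light channel shows essentially no variation with delay, whereas the channel containing the narrow absorption line shows a clear sinusoidal fringe.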



C. Division-of-Means Approximation 

In section 3.2 we make an approximation regarding the magnitude of velocity errors resulting
from a contaminant spectrum, where we state that the error in velocity is approximately equal to
the flux ratio of the contaminant to the true source spectrum multiplied by the velocity difference
between the two spectra (equation [56]). Namely, we assume that $\langle F_c/F_s\rangle \approx \langle F_c\rangle/\langle F_s\rangle$ (we will use
$\langle\ldots\rangle$ to represent the mean here for notational convenience). The same is assumed several times in
the same section (equations [57] and [59]). This approximation holds true provided that the fractional
variation with wavelength of the power spectrum in the denominator is, for the most part, relatively
small. We also make the same approximation regarding the division of mean visibilities in sections
3.3.2 and 3.3.3 (equations [60] and [64]). Here we show the validity of this approximation.

Consider two arbitrary functions $A(x)$ and $B(x)$, which have fractionally relatively small variations
about their means, so that we can define two corresponding functions, $a(x)$ and $b(x)$, such
that

$$A(x) = \langle A\rangle\,(1 + a(x)); \quad B(x) = \langle B\rangle\,(1 + b(x)), \tag{C1}$$

where $a, b \ll 1$ for all $x$. Using the binomial expansion, we can write:

$$\begin{aligned}
\left\langle \frac{A}{B} \right\rangle &= \left\langle \frac{\langle A\rangle (1 + a)}{\langle B\rangle (1 + b)} \right\rangle = \frac{\langle A\rangle}{\langle B\rangle} \left\langle \frac{1 + a}{1 + b} \right\rangle \\
&= \frac{\langle A\rangle}{\langle B\rangle} \left\langle (1 + a)(1 - b + b^2 - b^3 + \ldots) \right\rangle \\
&= \frac{\langle A\rangle}{\langle B\rangle} \left\langle 1 - b + b^2 - b^3 + \ldots + a - ab + ab^2 - \ldots \right\rangle \\
&\approx \frac{\langle A\rangle}{\langle B\rangle} \left[ 1 + \langle b^2\rangle - \langle ab\rangle \right],
\end{aligned} \tag{C2}$$

where we have neglected terms of order $b^3$ and $ab^2$ and smaller, and where we can also drop the
terms $\langle a\rangle$ and $\langle b\rangle$, which must equal zero according to the definitions of $a$ and $b$.

In general $\langle ab\rangle \to 0$ if the functions $a$ and $b$ (and hence $A$ and $B$) are uncorrelated. In the limit
where the two functions are completely correlated (i.e., identical), then $\langle ab\rangle \to \langle b^2\rangle$, and the $\langle ab\rangle$
and $\langle b^2\rangle$ terms cancel, so that $\langle A/B\rangle \to \langle A\rangle/\langle B\rangle$. In the cases we are interested in, it is generally
unlikely that $\langle ab\rangle \ll 0$ (anticorrelation), or that $\langle ab\rangle \gg \langle b^2\rangle$. Therefore, we can reasonably take
$\langle b^2\rangle$ as an estimate of the fractional error in the division approximation. Conveniently, from the
definition of $b$, $\langle b^2\rangle$ turns out to be equal to the square of the standard deviation, $\sigma_B$,
of the function $B$, normalized by its mean:

$$\langle b^2\rangle = \left\langle \left( \frac{B - \langle B\rangle}{\langle B\rangle} \right)^2 \right\rangle = \left( \frac{\sigma_B}{\langle B\rangle} \right)^2.$$

Hence for functions with small fractional deviations from their respective means, the mean of
the quotient is approximately equal to the quotient of the means, and $(\sigma_B/\langle B\rangle)^2$ gives an estimate
of the fractional error in the approximation.
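The size of the neglected terms can be checked numerically. The sketch below compares $\langle A/B\rangle$ with $\langle A\rangle/\langle B\rangle$ and with the second-order form of equation (C2) for two synthetic, uncorrelated functions. The sinusoidal test functions, their amplitudes, and the helper name `division_of_means` are assumptions chosen only for illustration; they are not stellar spectra or visibility data.

```python
import numpy as np

def division_of_means(A, B):
    """Return <A/B>, <A>/<B>, the second-order form of (C2), and <b^2>."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    a = A / A.mean() - 1.0  # fractional deviations, as defined in (C1)
    b = B / B.mean() - 1.0
    mean_of_quotient = np.mean(A / B)
    quotient_of_means = A.mean() / B.mean()
    second_order = quotient_of_means * (1.0 + np.mean(b**2) - np.mean(a * b))
    return mean_of_quotient, quotient_of_means, second_order, np.mean(b**2)

# Synthetic, uncorrelated test functions (not real spectra).
x = np.linspace(0.0, 2.0 * np.pi, 10000, endpoint=False)
A = 3.0 * (1.0 + 0.1 * np.sin(x))
B = 2.0 * (1.0 + 0.2 * np.cos(3.0 * x))

exact, approx, second_order, b2 = division_of_means(A, B)
print("actual fractional error:   ", exact / approx - 1.0)
print("estimate <b^2>:            ", b2)
print("(sigma_B / <B>)^2:         ", (np.std(B) / np.mean(B)) ** 2)
```

For these test functions the actual fractional error and the estimate $\langle b^2\rangle$ should agree closely, and passing the same array for both arguments reproduces the fully correlated limit in which the two means coincide.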

Tests with stellar spectra show the approximation works quite well, both at ET-like resolutions
and with very high resolution synthetic spectra. Taking as an example $R \sim 5000$ spectra from
the KPNO ET of 36 UMa (F8V) and $\tau$ Cet (G8V) (obtained by binning the fringing spectra
along the slit direction), we find $\sigma_{\tau\,\mathrm{Cet}}/\langle F_{\tau\,\mathrm{Cet}}\rangle = 0.25$, giving an estimated error of 6% due to the
approximation. In practice, we find the ratio of $\langle F_{36\,\mathrm{UMa}}\rangle/\langle F_{\tau\,\mathrm{Cet}}\rangle$ to $\langle F_{36\,\mathrm{UMa}}/F_{\tau\,\mathrm{Cet}}\rangle$ to be 0.980, i.e.,
a 2% difference. For an emission spectrum in the denominator, such as ThAr, the approximation
will not hold well as there are significant regions of near-zero flux. However, it is unlikely that
one would be interested in considering some contaminant spectrum against a primary emission
spectrum source. Conceivably one might be interested in considering the effects of contamination
from an emission source (e.g., from stray fluorescent lighting leakage, sky emission, or leakage from
a reference lamp), but in this case the emission spectrum would be in the numerator: a stellar
spectrum will always be the function in the denominator, and so the approximation should still
hold.

The second case where the approximation is employed is in considering visibility ratios, as in
estimating the velocity error due to the reference whirl addition approximation, or estimating the
effect of residual interferometer comb (equations [60] and [64]). In both cases here the function in
the denominator is absolute visibility as a function of wavelength, $\alpha(\lambda)$, for reference-multiplied
stellar data (since the comb and cross-talk terms both appear in the formula for the combined
star/reference data, equation [22]). Taking KPNO spectra as an example, the normalized variance
of the combined star/iodine data tends to be around $(\sigma_\alpha/\langle\alpha\rangle)^2 \approx 0.32$, in other words giving
an estimated error in the division approximation of around 30% for both comb and cross-talk
error sources: not as precise as the approximation for contaminating spectra, but still useful for
an order-of-magnitude error estimate. In neither of these cases do we expect any correlation between the
data visibility function in the denominator and the comb or cross-talk visibility function in the
numerator, so the $\langle ab\rangle$ term should be small.

Where there is concern over the assumptions made above (e.g., where the normalized variance
of the denominator, $(\sigma_B/\langle B\rangle)^2$, is not small, which would, for example, be the case were the
denominator to represent visibilities of a pure star spectrum, where the visibility distribution peaks at
very low values), then the approximation derived here is not appropriate, and instead it is necessary
to calculate $\langle F_c/F_s\rangle$ directly. Often this is in fact entirely practical; however, the approximation can
generally be used to give a very quick and convenient first-order estimate of contamination errors.
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